Foreword

Advances  in  hardware  technologies  have  led  to  extensive  use  of  sophisticated 
processors  to  build  multiprocessors  and  distributed  processors  for  a  variety  of 
real-time  systems  such  as  home  appliances,  flexible  manufacturing  systems,  process 
control  systems,  flight  control  systems  and  tactical  control  systems.  This  in  turn  has 
led  to  challenges  in  programming  distributed,  parallel  and  real-time  systems.  While 
distributed  programming  has  been  based  on  the  paradigm  of  achieving  a  common 
goal from partial knowledge (information), the principal objective of parallel processing
has  been  to  achieve  better  efficiency  and  throughput.  Even  though  concurrency  has 
been  investigated  for  many  years,  it  is  only  recently  that  a  firm  formal  foundation 
of  real-time  programming  has  been  established.  A  firm  mathematical  foundation  has 
made  it  possible  to  separate  the  issues  of  functional  and  timing  correctness  of  real-time 
programs.  Such  a  separation  in  turn  has  led  to  a  serious  investigation  of  design, 
methodology  and  standardization. 

The  primary  purpose  of  this  book  is  to  present: 

1.  issues  and  principles  of  specification  and  verification  of  real-time  distributed 
programs; 

2.  issues  of  true  concurrency; 

3.  study  of  epistemology  (or  the  study  of  knowledge)  and  distributed  systems;  and 

4.  issues  of  building  and  programming  scalable  concurrent  computers. 

The  book  is  essentially  in  two  parts:  the  first  part  is  devoted  to  issues  of  real-time 
programming  and  the  latter  part  covers  various  aspects  of  true  concurrency,  epistemology 
and  scalable  concurrent  computing. 

The  first  part  has  four  chapters.  The  first  chapter  Modelling  real-time  systems:  Issues 
and  challenges  by  R  K  Shyamasundar  and  S  Ramesh  surveys  the  issues  and  challenges 
that  lie  in  the  specification,  development,  and  verification  of  real-time  systems.  This 
section  mainly  emphasizes  the  issues  of  real-time  distributed  concurrency. 

Chapter  2  by  J  J  M  Hooman  and  W  P  de  Roever  provides  an  introduction  to 
compositional  methods  for  concurrency  and  their  application  to  real-time.  This  section 
is  devoted  to  a  discussion  of  formal  methods  to  specify  and  verify  concurrent  programs 
with  synchronous  message  passing.  The  emphasis  is  mainly  on  compositional  methods, 
i.e.,  methods  in  which  the  specification  of  a  compound  program  can  be  inferred  from 
specifications  of  its  constituents  without  reference  to  the  internal  structure  of  those 
parts.  Compositionality  enables  verification  during  the  process  of  top-down  design 
(the  derivation  of  correct  programs)  instead  of  the  more  familiar  a  posteriori 
verification  based  on  an  already  completed  program  code.  The  chapter  also  highlights 
the  main  principles  behind  compositionality  by  discussing  transitions  from  non- 
compositional  methods  to  compositional  methods  for  concurrent  programs. 

Priority  specification  plays  a  vital  role  in  the  development  of  predictable  systems. 
This  is  discussed  in  the  next  chapter  by  R  K  Shyamasundar  and  L  Y  Liu.  The  notion 
of  priority  is  based  on  the  intuition  that  a  low  priority  action  can  proceed  only  if  the 
high priority action cannot proceed due to lack of a handshaking partner at that
point of execution. The authors discuss the issues of compositional specification of
prioritized  real-time  distributed  programming  languages  and  describe  an  approach 
wherein  one  can  preserve  compositionality  without  placing  unnecessary  restrictions 
between  prioritized  events  (local  or  global)  and  unprioritized  events  (local  or  global). 

Chapter  4  by  G  Berry  is  concerned  with  ESTEREL,  which  is  one  of  the  highly 
developed  languages  of  the  class  of  synchronous  concurrent  languages  dedicated  to 
reactive  systems.  ESTEREL  can  be  distinguished  from  the  languages  discussed  in  the 
previous  chapters  by  the  fact  that  it  is  based  on  the  notion  of  a  perfect  real-time 
machine  (i.e.,  synchrony  hypothesis,  wherein  control  and  communication  are  assumed 
to  be  taking  no  time).  Berry  discusses  a  hardware  implementation  of  a  subset  of 
ESTEREL.  In  the  translation  described,  each  program  generates  a  specific  circuit  that 
responds  to  any  input  in  one  clock  cycle.  It  is  shown  that  whenever  the  source  program 
satisfies  some  statically  verifiable  dynamic  properties,  the  circuit  is  semantically 
equivalent  to  the  source  program.  It  is  of  interest  to  note  that  the  translation  has 
been effectively implemented on the programmable active memory Perle0 developed
by  J  Vuillemin  and  his  group  at  Digital  Equipment. 

The  second  part  begins  with  a  survey  of  models  and  logics  for  true  concurrency 
by  Kamal  Lodaya,  Madhavan  Mukund,  R  Ramanujam  and  P  S  Thiagarajan.  In  this 
chapter,  the  authors  first  survey  formal  models  of  distributed  systems  in  which 
concurrency  is  specified  explicitly,  in  contrast  to  more  traditional  approaches  where 
concurrency  is  represented  implicitly  as  a  nondeterministic  choice  between  all  possible 
sequentializations  of  concurrent  actions.  In  the  second  half  of  the  presentation,  the 
authors  develop  a  family  of  logics  to  specify  and  reason  about  the  behavioural 
properties  of  the  models  described  in  the  first  part.  The  logics  defined  are  extensions 
of  temporal  logic  with  new  modalities  to  directly  describe  concurrency. 

Chapter  6  by  Rohit  Parikh  and  Paul  Krasucki  is  devoted  to  epistemology  or  the 
study  of  knowledge.  In  this  chapter,  the  authors  define  various  notions  of  the  study 
of  knowledge  and  distributed  systems  and  introduce  the  notion  of  levels  of  knowledge. 
The  authors  discuss  how  levels  of  knowledge  can  be  realized  in  distributed  systems 
and  arrive  at  protocols  that  precisely  realize  levels  of  knowledge  of  some  formulae. 

In  the  last  chapter,  Nalini  Venkatasubramanian,  Shakuntala  Miriyala  and  Gul  Agha 
focus  on  the  challenges  in  building  and  programming  scalable  concurrent  computers. 
The chapter brings out the inadequacy of current models of computing for programming
massively parallel computers and discusses three universal models of concurrent
computing, developed respectively from programming, architecture and algorithmic
perspectives. These models are shown to provide a powerful representation for
parallel computing and to be quite close to one another. The authors also argue that by
using a flexible universal programming model, an environment supporting heterogeneous
programming languages can be developed.

Thanks  go  to  Professors  R  Narasimha  and  N  Viswanadham  for  their  enthusiasm 
in  bringing  out  this  book  which  is  essentially  the  result  of  their  ideas  and 
encouragement.  I  also  thank  all  the  authors  for  the  articles  and  their  cooperation  in 
attending  to  various  modifications  in  a  timely  fashion  and  the  reviewers  for  giving 
the  feedback  within  a  short  time.  Last  but  not  least,  I  thank  the  editorial  staff  of  the 
Indian  Academy  of  Sciences  for  their  help  in  bringing  out  the  book. 


June  1992 


R  K  Shyamasundar 
Guest Editor


Modelling real-time systems: Issues and challenges

Abstract. In this paper, we discuss the issues and challenges that lie in
the specification, development, and verification of real-time systems. In
our presentation, we emphasize the issues underlying the modelling of
real-time distributed concurrency.

Keywords.  Real-time;  reactive  systems;  concurrency;  bisimulation;  trace 
equivalence;  scheduling. 


1.  Introduction 

Real-time  systems  are  designed  to  cater  to  many  applications  ranging  from  home 
appliances or laboratory instruments to process control systems, flexible manufacturing,
flight control and tactical control in military applications. Flexible
manufacturing  is  a  special  kind  of  real-time  application  where  the  behaviour  of  each 
manufacturing  machine  can  be  adapted  instantaneously  to  continuously  changing 
working  conditions  while  still  satisfying  a  global  optimality  criterion.  In  flight  control 
systems  real-time  automatic  manoeuvering  is  used  to  achieve  significant  reduction  of 
fuel  consumption  and  also  for  tactical  control  over  the  target.  In  these  systems,  the 
timely  execution  of  requests  and  responses  by  the  computers  is  critical  to  the  successful 
operation of both the physical systems and the computer itself. That is, in addition
to the normal functional requirements, responses to inputs (from
the environment) must occur within a given interval of time. We refer to these systems
as  real-time  systems  and  the  specified  intervals  of  time  as  deadlines.  We  use  the 
qualification  reactive  to  refer  to  the  fact  that  the  system  has  to  respond  to  environment 
stimuli  continuously.  In  such  systems  one  can  distinguish  two  kinds  of  deadlines. 

• Hard deadlines: The deadline must be met; otherwise the
result is useless. In other words, what is needed is the right output at the right time.

• Soft deadlines: Missing the deadline results in the
degradation of the system performance.
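The distinction between the two kinds of deadlines can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the time units and, in particular, the linear decay used for the soft deadline are hypothetical and are not taken from the text.

```python
from typing import Optional


def value_under_hard_deadline(result: str, finish: float,
                              deadline: float) -> Optional[str]:
    """Hard deadline: a result delivered late is useless and is discarded."""
    return result if finish <= deadline else None


def value_under_soft_deadline(finish: float, deadline: float,
                              decay_per_unit: float = 0.1) -> float:
    """Soft deadline: a late result is still usable, but worth less.

    The worth is 1.0 when on time and decays linearly with lateness.
    The linear decay is an illustrative assumption only.
    """
    lateness = max(0.0, finish - deadline)
    return max(0.0, 1.0 - decay_per_unit * lateness)


# A response that finishes one time unit late:
print(value_under_hard_deadline("open valve", finish=11.0, deadline=10.0))
print(value_under_soft_deadline(finish=11.0, deadline=10.0))
```

Under the hard deadline the late response yields nothing at all, whereas under the soft deadline it yields a reduced but non-zero value.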

One concept common to the majority of process control systems
is that of providing continual feedback to an unintelligent environment. The continual
demands  of  an  unintelligent  environment  cause  these  systems  to  have  relatively  rigid 
and  urgent  performance  requirements,  such  as  real-time  response  requirements  and 
fail-safe reliability requirements. It seems that this emphasis on performance
requirements is what really characterizes time-critical systems, and causes us to be
more  aware  of  their  roles  in  their  environments  than  we  are  for  other  types  of  systems. 
The  interface  between  a  process  control  system  and  its  environment  tends  to  be 
complex,  asynchronous,  highly  parallel  and  distributed.  This  is  another  direct  result 
of  the  process  control  concept,  because  the  environment  is  likely  to  consist  of  a  number 
of  objects  which  interact  with  the  system  and  each  other  asynchronously  in  a  parallel 
fashion.  Furthermore,  it  is  probably  the  complexity  of  the  environment  that 
necessitates  computer  support  in  the  first  place.  This  characteristic  makes  the 
requirements  difficult  to  specify  in  a  way  that  is  both  precise  and  comprehensible. 
Finally,  embedded  systems  can  be  extraordinarily  hard  to  test.  The  complexity  of  the 
system/environment  interface  is  one  obstacle,  and  the  fact  that  these  programs  often 
cannot  be  tested  in  their  operational  environments  is  another.  It  is  not  feasible  to 
test  flight-guidance  software  by  flying  with  it,  nor  to  test  ballistic-missile-defence 
software  under  battle  conditions.  Further,  embedded  systems  are  especially  likely  to 
have  stringent  resource  requirements.  These  are  requirements  on  the  resources,  mainly 
physical  in  this  case,  from  which  the  system  is  constructed.  This  is  because  embedded 
systems  are  often  installed  in  places  (such  as  satellites)  where  the  weight,  volume  or 
power  consumption  must  be  limited,  or  where  temperature,  humidity,  pressure  and 
other  factors  cannot  be  as  carefully  controlled  as  in  the  traditional  machine  room.  It 
is  important  to  note  that  a  failure  quite  often  results  in  economic,  human  and  ecological 
catastrophes.  Thus,  safety  and  reliability  are  extremely  important  for  time-critical 
process control systems. The various parameters one has to cope with in building
such  systems  can  be  seen  from  some  of  the  main  characteristics  of  real-time  systems 
given  below. 

(a)  The  system  tends  to  be  large,  complex  and  can  be  extraordinarily  hard  to  test. 

(b) The environment that the system interacts with is nondeterministic. That is, most
of the time, there is no way to anticipate in advance the precise order of external
events.

(c)  High  speed  external  events  (perhaps  in  parallel),  must  be  able  to  affect  the  flow 
of  control  in  the  system  easily. 

(d) Requests must be responded to and handled within certain bounded time limits.

(e)  The  system  is  a  coordinated  set  of  asynchronous  distributed  units. 

(f)  The  mission  time  is  long.  The  system  not  only  must  deal  with  ordinary  situations 
but  also  must  be  able  to  recover  from  some  extraordinary  ones. 

It  must  be  quite  evident  from  the  above  characteristics  that  the  design  of  complex 
real-time  systems  poses  a  serious  challenge  since  many  of  the  requirements  and 
restrictions  are  often  conflicting  with  one  another.  Thus,  one  of  the  most  important 
needs  is  to  design  sound  methodologies  for  the  specification,  verification  and 
development  of  real-time  systems  that  would  support  the  common  requirements  of 
flexibility  and  predictability  of  systems.  This  would  certainly  go  a  long  way  in  bridging 
the  thin  line  between  acceptable  and  unacceptable  systems. 

In  this  paper,  we  discuss  the  issues  and  challenges  that  lie  in  the  specification, 
development,  and  verification  of  real-time  systems  with  an  emphasis  on  the  modelling 
of  distributed  real-time  concurrency.  The  rest  of  the  paper  is  organized  as  follows:  §2 
discusses aspects of real-time systems that make them different from other systems and
the  notion  of  time;  §3  surveys  the  issues  of  modelling  real-time  reactive  systems  in 
some  detail  as  the  study  provides  a  basis  for  observation-based  specifications.  The 
challenges  in  the  design  of  real-time  systems  are  highlighted  in  §4  followed  by  a 
discussion in §5.


2. Characteristics of real-time systems

In this section, we discuss the need for an explicit notion of time, the difference between real-time
and traditional systems and the problem of real-time system design.

2.1 What is the purpose of an explicit notion of time?

Traditional  programs  describe  transformations  that  change  values  of  variables  in 
discrete  steps.  Any  processor  implementing  these  transformations  takes  a  finite  amount 
of  time.  In  the  interest  of  generality,  programs  are  usually  designed  such  that  the 
computed  results  are  independent  of  the  execution  speed  of  their  processor(s).  In 
other words, time considerations are completely irrelevant for the functional behaviour
of  programs  and  their  correctness;  perhaps  it  is  only  relevant  for  questions  of  schedule 
and  efficiency. 

To  avoid  the  need  to  cope  with  explicit  time  considerations  even  in  the  case  of 
concurrent programming, a common agreement has evolved to use the concept
of  nondeterminism  to  abstract  from  concrete  time  to  handle  classes  of  processes 
working  with  different  relative  speeds.  Such  an  approach  helps  to  avoid  harmful 
comparisons  of  execution  times  and  thus,  provides  highly  abstract  semantic  models 
for non-sequential programs. The only indispensable assumption we need is that the
processors have non-zero finite speed. Adherence to execution-time independence
affords  the  tremendous  advantage  that  a  program’s  validity  can  be  deduced  solely 
from  the  static  program  text  containing  logical  assertions  on  the  state  of  the 
computations  after  each  statement  and  signal  exchange.  If  we  depart  from  this  rule 
and  let  our  program’s  validity  depend  on  the  execution  speed  of  the  utilized  processors, 
we  enter  the  area  commonly  called  real-time  programming  (Wirth  1977).  There  are 
two main reasons for designing time-dependent programs:

(i)  One  of  the  principal  reasons  for  consideration  of  execution-time  dependent 
programs  in  the  case  of  concurrent  programming  systems  is  that  certain  processes 
are  not  programmable  at  discretion,  as  they  may  be  part  of  the  environment;  this 
leads  to  situations  wherein  processes  fail  to  wait  for  synchronization  signals  indicating 
completeness  of  the  cooperating  partner’s  task.  As  a  result,  cooperation  with  such 
processes  will  necessarily  have  to  depend  on  processor  speed. 

(ii)  The  other  important  reason  for  considering  time  explicitly  is  the  case  of  reactive 
systems  that  model  some  physical  process;  here,  the  internal  laws  which  define  the 
natural  behaviour  of  the  physical  process  are  functions  of  a  parameter  referred  to  as 
physical  time.  The  need  for  reference  to  absolute  time  (or  clock)  for  these  classes  is 
obvious. 

In  the  rest  of  the  paper,  we  essentially  concern  ourselves  with  the  latter  category. 

2.2 What are real-time systems?

There have been several dichotomies of systems such as deterministic/nondeterministic,
synchronous/asynchronous, off-line/on-line, virtual time/real-time, sequential/concurrent
etc. However, from the point of view of the basic philosophy of design,
we  can  conveniently  distinguish  three  categories  of  systems  depending  on  the  way 
the  systems  interact  with  their  environment. 

(a)  Transformational:  get  the  input  in  the  beginning  and  send  the  output  at  the  end. 

(b) Interactive systems: interact at their own speed with users or with other systems.

(c) Reactive systems: maintain a continuous interaction with their environment, but
at  a  speed  which  is  determined  by  the  environment  (and  not  by  the  program  or  the 
system).  In  other  words,  the  output  may  affect  future  inputs  due  to  feedback.  In 
general,  we  can  further  categorize  these  systems  depending  on  the  need  for  absolute 
time.  However,  in  the  sequel  we  will  make  the  distinction  explicit  when  required. 

From the design point of view (a) and (b) have almost identical characteristics in the
sense that these systems can be characterized by functions. However, the same is not true
of  reactive  systems.  A  reactive  system,  in  general,  does  not  compute  or  perform 
functions  (Harel  &  Pnueli  1985)  but  is  supposed  to  maintain  a  certain  ongoing 
relationship  with  its  environment. 
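The contrast between a system characterized by a function and one that maintains an ongoing relationship with its environment can be sketched in Python. The thermostat scenario, the names and the setpoint below are hypothetical and serve only as an illustration; a generator is used as one convenient way to model a component that is driven by stimuli arriving at the environment's pace.

```python
def average_reading(readings):
    """Transformational: all input is available at the start and the
    output is produced once, at the end. It is simply a function."""
    return sum(readings) / len(readings)


def reactive_controller(setpoint):
    """Reactive: never 'finishes'. It waits for the next stimulus from the
    environment, answers it, and keeps going. The command it emits
    influences the temperatures it will see later (feedback)."""
    command = None
    while True:
        temperature = yield command          # the environment decides when
        command = "heat_on" if temperature < setpoint else "heat_off"


print(average_reading([18.0, 20.0, 22.0]))   # 20.0

controller = reactive_controller(setpoint=20.0)
next(controller)                              # start; wait for first stimulus
print(controller.send(18.5))                  # heat_on
print(controller.send(21.0))                  # heat_off
```

The transformational routine is fully described by its input/output pair, whereas the controller is described only by the sequence of stimuli and responses it exchanges over time.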

The  dichotomies  mentioned  earlier  are  equally  applicable  to  these  systems. 
However,  what  we  are  interested  in  mainly  is  to  see  how  the  methods  of  design  of 
traditional  systems  are  not  amenable  for  the  design  of  reactive  systems.  For  this 
purpose,  let  us  look  at  some  of  the  issues  one  is  faced  with  in  the  development  of  a 
complex  system. 

2.3 How to design a reactive system?

The  main  issues  one  addresses  in  the  development  of  a  complex  system  can  be  broadly 
categorized  as  follows. 

(a)  The  need  for  separation  of  concerns:  that  is,  the  question  is,  how  does  one 
decompose  the  behaviour  of  the  system?  This  is  an  important  issue  as  it  provides  a 
basis  for  system  design. 

(b)  Refinement  of  behavioural  components  of  the  systems. 

(c)  Interaction  of  the  behavioural  components. 

In the case of reactive systems, it is often difficult to provide any
behavioural decomposition at all, even if one ignores the requirement that the decomposition
form the basis for system design. In other words, the separation of concerns turns
out to be extremely difficult. For example, even a small real-time system such as a tactical
embedded system for an aircraft might be simultaneously maintaining a radar display,
calculating weapon trajectories, performing navigation functions etc. In these kinds
of  systems,  one  sees  that  (1)  the  code  implementing  the  various  tasks  is  mixed  together 
such  that  it  is  difficult  to  determine  which  task(s)  a  given  part  of  the  code  performs, 
and  (2)  the  timing  dependencies  between  code  sections  are  such  that  changing  the 
timing  characteristics  of  one  section  may  affect  whether  or  not  many  otherwise 
unrelated  tasks  meet  their  deadlines.  Thus,  one  of  the  challenges  is  to  provide  a 
method  for  behavioural  description  such  that: 

(i)  behavioural  description  leads  to  separation  of  concerns; 

(ii)  the  behavioural  description  captures  the  what  part  effectively  in  a  compositional 
(incremental)  way. 

In the case of transformational systems, it is possible to decompose the system in
a way reflecting the natural structure of the problem. However, for reactive systems such a decomposition
is almost impossible since the interface between the system and the environment is
complex, asynchronous, nondeterministic, highly parallel and distributed. In other
words, behavioural description is the main issue that sets reactive systems apart from
traditional systems. Thus, one of the immediate needs is to come up with a
compositional (or modular) behavioural description of real-time systems. This
would provide a basis on which a sound methodology for real-time programming
can be built.


3.  Issues  of  modelling  real-time  reactive  systems 

As  discussed  already,  a  reactive  system  maintains  continuous  interaction  with  the 
environment  and  maintains  a  certain  ongoing  relationship  with  the  environment.  The 
three  parameters  that  play  a  vital  role  in  the  modelling  of  real-time  reactive  systems  are: 

•  communication  mechanism; 

•  environmental  abstraction; 

•  real-time. 

Modelling  of  environmental  abstraction  is  dependent  on  the  actions  that  can  be 
observed.  In  other  words,  models  of  concurrency  for  distributed  systems  provide  a 
nice  basis  for  environmental  abstraction  for  reactive  systems  without  the  explicit 
notion  of  real-time.  Thus,  the  issues  of  modelling  real-time  reactive  systems  can  be 
broadly  categorized  into: 

•  models  of  communication; 

•  models  of  reactive  systems; 

•  real-time  and  concurrency. 

3.1  Models  of  communication 

There  are  various  ways  of  transmission  from  one  task  to  another.  Normally,  in  any 
mode  of  communication,  one  can  identify  three  entities:  sender,  receiver  and  medium. 
The  first  two  are  active  processes  whereas  the  third  is  passive  and  it  denotes  the  form 
of  information  in  transit.  A  spectrum  of  media  can  be  obtained  based  on  various 
parameters  such  as: 

• whether the sender can always send a message or can be blocked;

• whether the receiver may receive a message only when the medium is not empty;

• whether there is any constraint on the number of messages;

• whether the transmission is one-to-one or one-to-many;

• whether the order of messages received is the same as the order of messages sent.

It  must  be  quite  clear  that  a  careful  treatment  of  the  media  is  essential  for  realizing 
a  general  model.  Based  on  the  various  hardware  architectures,  one  can  obtain  a 
variety  of  models.  Some  of  the  prominent  ones  are  discussed  below. 

(1) The shared memory model broadly follows this discipline:

•  the  sender  may  always  write  an  item  to  a  register  or  a  location  (of  course,  one 
assumes  that  the  access  of  a  register  is  mutually  exclusive); 

•  the  receiver  may  read  an  item  from  a  register  (or  a  location); 

•  reading  and  writing  may  occur  in  any  order. 

(2) One can treat the act of sending and receiving as one single indivisible act of
communication; in other words, it can be treated as a point-to-point action. This is
often referred to as synchronous or handshake communication. Such a discipline unifies the three
entities in the sense that sender and receiver participate in indivisible acts of
communication experienced simultaneously by the sender and receiver. A further
refinement can be obtained by defining whether the two participating agents exchange
information with each other or the information flows in a unidirectional way. The
equations for handshake communication will be discussed in the coming sections
where the net effect of communication is replaced by a single non-observable action,
denoted by $\tau$ and referred to as the silent action or the perfect action. In fact, different
models can be obtained by considering details such as whether the channels can be
shared or the complementary actions can be mapped to some other alphabet. Most
of the formalisms of CSP (Hoare 1988) and CCS (Milner 1980) are built on these models.

(3) Another discipline, referred to as the asynchronous discipline, has the following
characteristics:

•  the  sender  is  not  blocked; 

•  the  receiver  can  receive  a  message  provided  the  medium  is  nonempty. 

The  asynchronous  discipline  can  be  further  refined  by: 

•  allowing  simultaneous  message  transmission  to  various  agents  rather  than  point-to- 
point;  this  is  often  referred  to  as  broadcast  transmission. 

•  many  agents  can  combine  and  exchange  information  among  themselves;  this  is 
referred  to  as  multicast. 

A  formal  analysis  of  the  broadcast  mechanisms  has  been  detailed  in  Shyamasundar 
et  al  (1987);  the  multicast  paradigms  can  be  seen  in  the  exchange  functions  treated  in 
Fitzwater  &  Zave  (1977)  and  also  in  Shyamasundar  et  al  (1991)  in  the  context  of 
formalizing  coordinating  actions. 
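Two of the disciplines described in this section, shared memory and asynchronous communication, can be sketched as small Python classes. This is a simplified, single-threaded illustration: the class and method names are hypothetical, mutual exclusion is not modelled, and the unbounded medium and first-in-first-out ordering of the asynchronous channel are assumptions (the text lists capacity and ordering as parameters that may vary).

```python
from collections import deque


class SharedRegister:
    """Shared memory discipline: the sender may always write, the receiver
    may always read, and reads and writes may occur in any order."""

    def __init__(self, initial=None):
        self._value = initial

    def write(self, item):
        self._value = item        # overwrites; earlier items may never be read

    def read(self):
        return self._value        # non-destructive; may see an item twice


class AsyncChannel:
    """Asynchronous discipline: the sender is never blocked; the receiver
    can receive only when the medium is nonempty."""

    def __init__(self):
        self._medium = deque()    # information in transit

    def send(self, message):
        self._medium.append(message)      # unbounded medium (assumption)

    def can_receive(self):
        return len(self._medium) > 0

    def receive(self):
        if not self._medium:
            raise RuntimeError("medium is empty: the receiver must wait")
        return self._medium.popleft()     # FIFO order (assumption)


register = SharedRegister()
register.write(1)
register.write(2)
print(register.read())            # 2: the first item was overwritten unread

channel = AsyncChannel()
channel.send("x")
channel.send("y")
print(channel.receive())          # x: nothing is lost, order is preserved
```

The handshake discipline is deliberately left out of this sketch: there the medium disappears altogether, because sending and receiving form one indivisible step.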

3.2  Models  of  reactive  systems 

To  make  the  discussion  concrete,  we  take  a  very  simple  scheme  for  specifying  behaviour 
of  reactive  systems  without  an  explicit  clock.  A  reactive  system  has  an  alphabet  of 
events, corresponding to the set of possible observable¹ events in which the processes
may  be  involved.  The  events  require  the  participation  of  both  the  system  and  its 
environment  which  can  be  taken  to  be  a  user/another  sub-component  of  a  larger 
system.  Following  the  notation  of  Milner  (1980)  and  Hoare  (1988),  any  reactive  system 
is  a  well-formed  syntactic  term  involving  the  events  and  the  two  operators  +  and  ||. 
More  precisely,  given  an  event  alphabet  E,  reactive  processes  are  defined  as  follows. 

(1) nil is a process which performs no actions.

(2) If $p$ is a process then, for each $e \in E$, $e.p$ is a process which first performs an $e$ and
then behaves like $p$.

(3) If $p$, $q$ are processes then $p + q$ and $p \parallel q$ are processes.

The operators + and || denote respectively the nondeterministic choice and the
parallel composition operator. The operational semantics of the above language is
given in terms of a labelled transition system.

A labelled transition system is given by $(\mathit{Proc}, \mathit{Act}, \rightarrow)$, where $\mathit{Proc}$ is the set of
process states, $\mathit{Act}$ is the set of actions (or events) and $\rightarrow \subseteq \mathit{Proc} \times \mathit{Act} \times \mathit{Proc}$. A transition $p \xrightarrow{a} q$
is interpreted as follows: $p$ is observed to do $a$, and changes to become $q$.

¹ The role of observation in the modelling of real-time reactive systems is discussed while
relating the models of reactive systems with time.

In  sequential  programming  languages,  there  is  a  clear-cut  notion  of  what  the 
observable behaviour of a program is: an input/output pair leading to the semantic
description  of  a  program  as  a  function  or  in  the  nondeterministic  case,  a  relation  or 
multifunction.  However,  for  concurrent/reactive  languages,  there  is  no  single  canonical 
notion  of  observable  behaviour  but  rather  a  multiplicity  of  semantic  models.  In 
particular,  each  notion  of  observability  yields  an  answer  to  the  following  basic  question 
which  any  semantics  should  address. 

When  do  two  expressions  have  the  same  meaning?  That  is, 

for every observable property $A$: $p \;\mathrm{sat}\; A \iff q \;\mathrm{sat}\; A$.

In  other  words,  two  processes  are  equivalent  (hence,  have  equal  meanings)  when  they 
admit  the  same  observations.  By  varying  the  notion  of  observable  property  of  behaviour, 
we  also  vary  the  associated  equivalence;  the  more  we  can  observe,  the  more  chances 
we  have  of  distinguishing  processes  and,  hence,  fewer  processes  are  made  equivalent 
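(equivalence is finer). The transitions for the language defined above are given
below. Let $\rightarrow$ be defined as the least relation satisfying the following axioms and rules:

$$a.p \xrightarrow{a} p$$

$$\frac{p \xrightarrow{a} p'}{p + q \xrightarrow{a} p'} \qquad \frac{q \xrightarrow{a} q'}{p + q \xrightarrow{a} q'}$$

$$p \xrightarrow{a} p' \;\Rightarrow\; p \parallel q \xrightarrow{a} p' \parallel q$$

$$q \xrightarrow{a} q' \;\Rightarrow\; p \parallel q \xrightarrow{a} p \parallel q'$$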
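The transition rules can be read as a recursive function that, given a process term, returns every transition it can perform. The following Python sketch is one possible encoding, not part of the original presentation: terms are represented as nested tuples, and the names `NIL`, `prefix`, `choice`, `par` and `steps` are hypothetical.

```python
NIL = ("nil",)                      # the process that performs no actions


def prefix(a, p):                   # a.p
    return ("pre", a, p)


def choice(p, q):                   # p + q
    return ("sum", p, q)


def par(p, q):                      # p || q
    return ("par", p, q)


def steps(p):
    """Return the set of pairs (a, p2) such that p can do a and become p2."""
    tag = p[0]
    if tag == "pre":                # axiom: a.p does a and becomes p
        return {(p[1], p[2])}
    if tag == "sum":                # p + q inherits the moves of p and of q
        return steps(p[1]) | steps(p[2])
    if tag == "par":                # one side moves, the other stays put
        left, right = p[1], p[2]
        return ({(a, par(l2, right)) for a, l2 in steps(left)} |
                {(a, par(left, r2)) for a, r2 in steps(right)})
    return set()                    # nil has no transitions


a_nil, b_nil = prefix("a", NIL), prefix("b", NIL)
for action, successor in sorted(steps(par(a_nil, b_nil))):
    print(action, successor)
```

For the term a.nil || b.nil the function yields two transitions: one labelled a leading to nil || b.nil and one labelled b leading to a.nil || nil, which is the interleaving reading of the two parallel rules.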

With  the  above  language  and  the  transition  system,  let  us  consider  the  various  classes 
of  observable  behaviours  that  lead  to  different  equivalences.  Some  of  the  presentation 
is  based  on  the  unpublished  lectures  by  Abramsky  (1989)  and  by  Groote  (1988). 

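3.2a Traces: Here, only sequences of actions can be observed; this is the simplest
notion leading to the coarsest (reasonable) equivalence. Define,

$$p \xrightarrow{s} q, \quad s = a_1 \cdots a_n \in \mathit{Act}^*, \quad \text{iff} \quad \exists p_1, \ldots, p_{n-1}.\ p \xrightarrow{a_1} p_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} q,$$

$$\mathit{Traces}(p) = \{ s \in \mathit{Act}^* \mid \exists q.\ p \xrightarrow{s} q \}.$$

Then, the equivalence is defined by

$$p \sim_T q \iff \mathit{Traces}(p) = \mathit{Traces}(q).$$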

The  above  equivalence  is  too  coarse  as  it  ignores  the  differences  in  deadlock  behaviour. 
This  is  illustrated  in  the  classical  example  shown  in  figure  1. 

In the above example, $\mathit{Traces}(p) = \mathit{Traces}(q) = \{\varepsilon, a, ab, ac\}$. It may be observed
that after doing $a$, $p$ can always do $b$, while $q$ can sometimes refuse to do $b$.
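This observation can be checked mechanically. The Python sketch below assumes that the two processes of figure 1 are p = a.(b + c) and q = a.b + a.c; the figure is not reproduced here, so these terms are an assumption chosen to agree with the trace set quoted above. Terms are encoded as nested tuples and all function names are hypothetical.

```python
NIL = ("nil",)


def prefix(a, p):                   # a.p
    return ("pre", a, p)


def choice(p, q):                   # p + q
    return ("sum", p, q)


def steps(p):
    """All pairs (a, p2) such that p can do a and become p2."""
    if p[0] == "pre":
        return {(p[1], p[2])}
    if p[0] == "sum":
        return steps(p[1]) | steps(p[2])
    return set()                    # nil


def traces(p):
    """Traces(p): every finite action sequence p can perform."""
    result = {()}                   # the empty trace is always possible
    for a, q in steps(p):
        result |= {(a,) + t for t in traces(q)}
    return result


def can_do(p, a):
    return any(action == a for action, _ in steps(p))


p = prefix("a", choice(prefix("b", NIL), prefix("c", NIL)))          # a.(b + c)
q = choice(prefix("a", prefix("b", NIL)), prefix("a", prefix("c", NIL)))  # a.b + a.c

print(traces(p) == traces(q))                                 # True
print(all(can_do(p2, "b") for a, p2 in steps(p) if a == "a"))  # True
print(all(can_do(q2, "b") for a, q2 in steps(q) if a == "a"))  # False
```

The two terms have the same traces, yet after a the first can always do b while the second may have committed to the branch that offers only c.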


Figure  1.  Trace  equivalence. 
Figure 2. Complete trace equivalence.


3.2b Maximal (complete) traces: Consider the processes shown in figure 2.

It can be easily seen that $p \sim_T q$. However, it can be seen that the processes can in
fact  be  distinguished  by  observing  that  process  p  has  a  trace  a  from  which  there  is 
no  further  action  and  process  q  does  not  have  such  a  trace.  Refinements  corresponding 
to  such  experiments  can  be  obtained  by  adding  the  capability  to  observe  inactions. 
Maximal  (complete)  traces  of  processes  lead  to  such  refinements. 

Let $p \nrightarrow$ denote the fact that there does not exist any $a$ and $p'$ such that $p \xrightarrow{a} p'$.
Then, the maximal traces$^2$ (MT for short) are defined by,

$MT(p) = \{ \sigma \mid \sigma \in Act^*, \exists q.\ p \xrightarrow{\sigma} q \nrightarrow \}$.

Then the equivalence is defined by $p \sim_{MT} q \iff MT(p) = MT(q)$.
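A small sketch may help to see how observing inaction separates processes. It assumes, for illustration only, the pair $p = a + ab$ and $q = ab$, which matches the description of figure 2 (p has a trace a after which nothing further is possible, q does not); the encoding and the name `maximal_traces` are hypothetical.

```python
def maximal_traces(lts, start):
    """Traces of a finite acyclic system that end in a state with no transitions."""
    if not lts[start]:
        return {()}  # a state without outgoing edges ends a maximal trace
    result = set()
    for action, nxt in lts[start]:
        result |= {(action,) + t for t in maximal_traces(lts, nxt)}
    return result


# p = a + ab: after a, p may be stuck or may still offer b.
P = {"p0": [("a", "p1"), ("a", "p2")], "p1": [], "p2": [("b", "p3")], "p3": []}
# q = ab: after a, b is always possible.
Q = {"q0": [("a", "q1")], "q1": [("b", "q2")], "q2": []}

mt_p = maximal_traces(P, "p0")
mt_q = maximal_traces(Q, "q0")
```

Here `mt_p` contains the one-action trace a while `mt_q` does not, although both processes have the same ordinary traces.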

3.2c Failure sets: $s \in MT(p)$ can be interpreted as: after process p does s, it refuses
to do any action. A refinement of such a notion can be obtained by observing a
process  refusing  a  subset  of  actions  offered  by  the  environment.  The  equivalences 
that  can  be  obtained  with  such  an  observable  capacity  are  formally  defined  below. 

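Let $(s, X) \in Act^* \times \mathcal{P}(Act)$. Define $F(p) = \{ (s, X) \mid \exists q.\ p \xrightarrow{s} q \wedge \forall a \in X.\ q \stackrel{a}{\nrightarrow} \}$. Then the
equivalence is defined by $p \sim_F q \iff F(p) = F(q)$.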

Consider  processes  p  and  q  shown  in  figure  3. 

It can be easily seen that $MT(p) = MT(q)$; however, $F(p) \ne F(q)$.
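The definition of $F(p)$ can be turned into a direct enumeration for finite, acyclic systems. The sketch below is only an illustration: it uses the pair $a(b + c)$ and $ab + ac$ (an assumed example, not necessarily the one drawn in figure 3), which have the same maximal traces but different failure sets. The helper names `reachable`, `powerset` and `failures` are hypothetical.

```python
from itertools import combinations


def powerset(items):
    """All subsets of `items`, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def reachable(lts, start):
    """List of (trace, state) pairs for a finite acyclic transition system."""
    out = [((), start)]
    for action, nxt in lts[start]:
        out += [((action,) + s, q) for s, q in reachable(lts, nxt)]
    return out


def failures(lts, start, alphabet):
    """F(p): pairs (s, X) such that some state reached by s refuses every action in X."""
    result = set()
    for s, q in reachable(lts, start):
        enabled = {a for a, _ in lts[q]}
        for refused in powerset(set(alphabet) - enabled):
            result.add((s, refused))
    return result


ACT = {"a", "b", "c"}
P = {"p0": [("a", "p1")], "p1": [("b", "p2"), ("c", "p3")], "p2": [], "p3": []}
Q = {"q0": [("a", "q1"), ("a", "q2")], "q1": [("b", "q3")], "q2": [("c", "q4")],
     "q3": [], "q4": []}

witness = (("a",), frozenset({"b"}))  # after a, the process refuses b
```

The pair `witness` belongs to the failure set of `Q` but not to that of `P`, which is the kind of blocking observation that maximal traces cannot express.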


Figure  3.  Failure  set  equivalence. 

3.2d Failure traces: A refinement of the failure equivalence can be obtained by
considering failure sets at all the intermediate points of the traces. Failure traces
(denoted FT) are defined by,

$FT(p) = \{ (a_0, X_0)(a_1, X_1) \cdots (a_n, X_n) \mid \exists p = p_0, \ldots, p_n,\ p_i \xrightarrow{a_i} p_{i+1} \wedge \forall a \in X_i.\ p_i \stackrel{a}{\nrightarrow},\ 0 \le i \le n \}$.

Then the equivalence is defined by,

$p \sim_{FT} q \iff FT(p) = FT(q)$.

Consider  the  two  processes  shown  in  figure  4. 

The  two  processes  are  not  distinguishable  under  failure  sets  whereas  they  are 
distinguishable under failure traces. The latter follows from the fact that $(a, \{b\})\,c \in FT(p) - FT(q)$.


2  This  is  useful  for  terminating  systems. 
Figure 4. Failure trace equivalence.


3.2e Ready sets: If instead of observing what actions cannot be done, the enablement
or  the  readiness  of  actions  can  be  observed  then  one  gets  another  notion  of  ready 
equivalence.  Surprisingly,  it  turns  out  that  this  notion  refines  that  of  failure  set 
equivalence.  Define, 

$R(p) = \{ (s, X) \mid \exists q.\ p \xrightarrow{s} q \wedge \forall a \in Act.\ (q \xrightarrow{a} \iff a \in X) \}$.

Then the equivalence is defined by,
$p \sim_R q \iff R(p) = R(q)$.

Consider  the  processes  shown  in  figure  5a. 

It  can  be  seen  that  the  two  processes  shown  in  figure  5a  are  not  distinguishable 
under  failure  sets  whereas  they  are  distinguishable  under  ready  sets. 

Note. It is of interest to note that from the set of ready pairs of a process graph$^3$,
the  set  of  failure  pairs  can  be  deduced  but  not  the  other  way  round. 
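The one-way deduction in the note can be made concrete: a ready pair $(s, Y)$ yields every failure pair $(s, X)$ with $X \subseteq Act - Y$. The sketch below is an illustration under assumptions: it uses the pair $ab + ac$ and $ab + ac + a(b + c)$, chosen here as an example of processes with equal failure sets but different ready sets (it is not claimed to be the pair drawn in figure 5a), and the names `ready_pairs` and `failures_from_ready` are hypothetical.

```python
from itertools import combinations


def ready_pairs(lts, start):
    """R(p): pairs (s, Y) where Y is exactly the set of actions enabled after s."""
    pairs = {((), frozenset(a for a, _ in lts[start]))}
    for action, nxt in lts[start]:
        pairs |= {((action,) + s, ready) for s, ready in ready_pairs(lts, nxt)}
    return pairs


def failures_from_ready(pairs, alphabet):
    """Deduce the failure pairs: any subset of the non-enabled actions is refused."""
    result = set()
    for s, ready in pairs:
        refusable = sorted(set(alphabet) - ready)
        for r in range(len(refusable) + 1):
            for refused in combinations(refusable, r):
                result.add((s, frozenset(refused)))
    return result


ACT = {"a", "b", "c"}
# p = ab + ac
P = {"p0": [("a", "p1"), ("a", "p2")], "p1": [("b", "p3")], "p2": [("c", "p4")],
     "p3": [], "p4": []}
# q = ab + ac + a(b + c)
Q = {"q0": [("a", "q1"), ("a", "q2"), ("a", "q3")], "q1": [("b", "q4")],
     "q2": [("c", "q5")], "q3": [("b", "q6"), ("c", "q7")],
     "q4": [], "q5": [], "q6": [], "q7": []}

rp, rq = ready_pairs(P, "p0"), ready_pairs(Q, "q0")
```

For this pair the deduced failure sets coincide, while `rq` contains the ready pair $(a, \{b, c\})$ that `rp` lacks; the ready pairs therefore carry strictly more information.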

Consider the processes shown in figure 5b. It can be seen that $p \sim_R q$; however,
$p \not\sim_{FT} q$. Thus, from figures 5a and b, the incompatibility of failure traces and ready
sets follows.



Figure  5.  (a)  Ready  set  equivalence,  (b)  Ready  set  and  failure  trace  equivalence. 


3  In  this  paper,  we  have  used  these  terminologies  in  an  informal  sense;  for  a  formal  definition, 
the  reader  is  referred  to  Baeten  &  Weijland  (1990). 
3.2f Ready traces: The equivalence can be further refined by considering the ready
sets at all the intermediate points. Ready traces (denoted RT) are defined by,

$RT(p) = \{ (a_1, X_1)(a_2, X_2) \cdots (a_n, X_n) \mid \exists p = p_0, \ldots, p_n,\ p_i \xrightarrow{a_i} p_{i+1},\ (p_i \xrightarrow{a} \iff a \in X_i) \}$.

Then the equivalence is defined by,

$p \sim_{RT} q \iff RT(p) = RT(q)$.

Consider the processes p and q as shown in figure 4. It can be easily seen that
$p \sim_R q$ and $p \not\sim_{RT} q$. The latter follows by looking at all the ready sets at the various
points of the process graphs. Now, consider the processes shown in figure 5a from
the point of view of ready and failure traces.

•  $FT(p) = clos\{ (a, \{b, c\})(b, \{c\}),\ (a, \{b, c\})(c, \{b\}) \}$;
$FT(q) = clos\{ (a, \{b, c\})(b, \{c\}),\ (a, \{b, c\})(c, \{b\}) \}$; hence, $FT(p) = FT(q)$, where clos
gives the prefix-closure of the set with respect to the first element and any subset
of the failure sets of the elements of the sequence.

•  However, $RT(p) \ne RT(q)$, since $(a, \{a\})(b, \{b, c\}) \in RT(q)$ but $\notin RT(p)$.

3.2g  Simulation  and  bisimulation:  In  a  sense,  all  the  above  equivalences  are  some 
refinements  of  traces  or  execution  sequences.  In  the  following,  we  define  some  notions 
that  are  based  on  the  notions  of  the  execution  trees  induced  by  the  processes. 

A  natural  notion  of  process  equivalence  can  be  obtained  by  formally  interpreting 
a  labelled  directed  graph  as  a  process;  we  refer  to  such  graphs  as  process  graphs. 

A simulation of process graph $G_1$ (say, $\langle V_1, E_1 \rangle$) by process graph$^4$ $G_2$ (say,
$\langle V_2, E_2 \rangle$) is a binary relation $R$ between their nodes satisfying the following conditions
($\xrightarrow{a}$ can be treated as a reachability relation):

(i) $V_1 \subseteq Dom(R)$.

(ii) $(p, q) \in R \supset \forall a, p'.\ (p \xrightarrow{a} p' \text{ in } G_1 \supset \exists q'.\ q \xrightarrow{a} q' \text{ in } G_2 \wedge (p', q') \in R)$.

Two graphs are simulation equivalent (denoted $\sim_S$) if there exist simulations in both
directions. 
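For finite graphs, the largest relation satisfying condition (ii) can be computed by starting from all pairs of nodes and repeatedly discarding pairs that violate it. The sketch below is one such procedure, given only as an illustration; the function name `simulation` and the example graphs (for $a(b + c)$ and $ab + ac$) are assumptions. It checks whether the roots are related, which for graphs whose nodes are all reachable from the root amounts to condition (i).

```python
def simulation(g1, g2):
    """Largest relation R with: (p, q) in R implies every p -a-> p' in g1
    is matched by some q -a-> q' in g2 with (p', q') in R."""
    rel = {(p, q) for p in g1 for q in g2}
    changed = True
    while changed:
        changed = False
        for p, q in list(rel):
            ok = all(
                any(a == b and (p2, q2) in rel for b, q2 in g2[q])
                for a, p2 in g1[p]
            )
            if not ok:
                rel.discard((p, q))
                changed = True
    return rel


# G1 for a(b + c), G2 for ab + ac
G1 = {"p0": [("a", "p1")], "p1": [("b", "p2"), ("c", "p3")], "p2": [], "p3": []}
G2 = {"q0": [("a", "q1"), ("a", "q2")], "q1": [("b", "q3")], "q2": [("c", "q4")],
      "q3": [], "q4": []}

g2_simulated_by_g1 = ("q0", "p0") in simulation(G2, G1)
g1_simulated_by_g2 = ("p0", "q0") in simulation(G1, G2)
```

In this example a simulation exists in one direction only, so the two graphs are not simulation equivalent.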

It may be noted that $R^{-1}$ need not necessarily be a simulation. If $R^{-1}$ is also a
simulation then one gets bisimulation equivalence. This is the finest reasonable
equivalence  which  is  not  based  on  the  traces  or  execution  sequences.  This  is  formalized 
below. 

In  a  sense,  bisimulation  is  the  finest  reasonable  equivalence  (could  be  considered 
as  single  step  true  concurrency)  based  on  the  notions  due  to  D  Park  and  R  Milner. 
Intuitively,  the  equivalence  corresponds  to  comparing  states  for  equivalence  recursively 
by the condition that every action of p has a matching action of q leading to an
equivalent state and vice versa. Formally, it can be defined as follows,

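$p \sim_B q \iff \forall a\, [\, (\forall p'.\ p \xrightarrow{a} p' \Rightarrow \exists q'.\ q \xrightarrow{a} q' \wedge p' \sim_B q')$

$\wedge\ (\forall q'.\ q \xrightarrow{a} q' \Rightarrow \exists p'.\ p \xrightarrow{a} p' \wedge p' \sim_B q') \,]$.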

In  other  words,  bisimulation  identifies  processes  just  when  they  unfold  into  the  same 
(unordered)  labelled  trees. 


4 Here, $V_i$ and $E_i$ respectively denote the sets of nodes and edges of $G_i$.

Figure 6. Bisimulation and ready trace equivalence.


Example. $a(b + c) \not\sim_B ab + ac$ since $b + c \not\sim_B b$ and $b + c \not\sim_B c$.
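The example can be verified with a naive fixed-point computation: start from the relation containing all pairs of states and remove pairs for which either half of the defining condition fails, until nothing changes. The following Python sketch is an illustration for finite systems only; the function names `matches` and `bisimilar` and the state names are hypothetical.

```python
def matches(lts, x, y, related):
    """Every move x -a-> x2 is answered by some y -a-> y2 with related(x2, y2)."""
    return all(
        any(a == b and related(x2, y2) for b, y2 in lts[y])
        for a, x2 in lts[x]
    )


def bisimilar(lts, p, q):
    """Decide p ~B q in a finite transition system by refining the full relation."""
    rel = {(x, y) for x in lts for y in lts}
    changed = True
    while changed:
        changed = False
        for x, y in list(rel):
            forward = matches(lts, x, y, lambda u, v: (u, v) in rel)
            backward = matches(lts, y, x, lambda v, u: (u, v) in rel)
            if not (forward and backward):
                rel.discard((x, y))
                changed = True
    return (p, q) in rel


# One system containing both a(b + c) (root p0) and ab + ac (root q0).
LTS = {
    "p0": [("a", "p1")], "p1": [("b", "p2"), ("c", "p3")], "p2": [], "p3": [],
    "q0": [("a", "q1"), ("a", "q2")], "q1": [("b", "q3")], "q2": [("c", "q4")],
    "q3": [], "q4": [],
}

roots_bisimilar = bisimilar(LTS, "p0", "q0")
```

The pair of roots is removed because the state `p1` (that is, $b + c$) is related neither to `q1` (that is, $b$) nor to `q2` (that is, $c$), exactly as the example argues.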

The  concept  of  bisimulation  can  also  be  captured  in  terms  of  equations  over  sets 
as  follows. 

Define $Bis(p) = \{ (a, Bis(q)) \mid p \xrightarrow{a} q \}$.

Such a solution always exists if one assumes Aczel’s anti-foundation axiom. That is,
$Bis(p)$ may become a non well-founded set. The distinguishability of processes
discussed  above  can  be  captured  in  terms  of  the  following  sets: 

$\langle a, \{ \langle b, \emptyset \rangle, \langle c, \emptyset \rangle \} \rangle \in Bis(p)$

The  example  shown  in  figure  6  illustrates  that  bisimulation  refines  ready  traces. 

A  further  refinement  of  equivalences  lying  between  simulation  and  bisimulation 
can  be  obtained  by  refining  the  simulation  equivalence.  Two  of  such  refinements  are 
defined  below. 

A  2-nested  simulation  is  a  simulation  with  the  property  that  related  nodes  are 
simulation  equivalent.  In  other  words,  there  also  exists  a  simulation  in  the  reverse 
direction  between  the  subgraphs  that  are  rooted  by  related  nodes. 

Two graphs are 2-nested simulation equivalent (denoted $\sim_{2nS}$) if there exist
2-nested simulations in both directions.

For the processes shown in figure 2, we have $p \sim_S q$ and $p \not\sim_{MT} q$. From this and
the  fact  that  trace  equivalence  is  contained  in  simulation  and  maximal  trace 
equivalences,  we  can  conclude  the  incompatibility  of  the  two.  The  incompatibility  of 
the  ready  trace  and  simulation  follows  from  the  example  shown  in  figure  7a. 

A ready simulation is a simulation such that related nodes have the same set of
initial actions. The underlying equivalence is referred to as ready simulation equivalence
and is denoted by $\sim_{RS}$.

The  examples  shown  in  figures  7b  and  c  differentiate  the  equivalences  due  to 
simulation,  2-nested  simulation  and  bisimulation. 

3.3  Comparison  of  the  various  equivalences 

The  following  theorem  (cf.  Baeten  &  Weijland  1990  for  proof)  shows  the  implications 
of  the  various  equivalences  discussed  above. 

Theorem  1.  Let  g  and  h  be  any  two  process  expressions.  Then, 

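(1) if $g \sim_B h$ then $g \sim_R h$; (2) if $g \sim_R h$ then $g \sim_F h$; (3) if $g \sim_F h$ then $g \sim_T h$; (4) if $g \sim_B h$
then $g \sim_{RT} h$; (5) if $g \sim_{RT} h$ then $g \sim_{FT} h$; (6) if $g \sim_{RT} h$ then $g \sim_R h$; (7) if $g \sim_{FT} h$ then
$g \sim_F h$.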

Note.  It  may  be  noted  that  traces  take  account  of  the  intermediate  state  in  a  very 
weak  way  whereas  bisimulation  does  so  in  a  very  strong  way. 
Figure  7.  (a)  Ready  trace  and  simulation  equivalence,  (b)  Bisimulation  and 
2-nested  simulation,  (c)  Ready  simulation  and  2-nested  simulation  equivalence. 


The results of the above theorem and the various incompatibility relations among
the various equivalence notions illustrated earlier are nicely captured in the semantic
lattice shown in figure 8.

We  can  summarize  the  various  observational  characteristics  that  make  the  various 
equivalences  finer  than  the  other  equivalences  as  follows: 

•  observability  of  inaction  refines  maximal  traces  over  that  of  traces; 

•  observability  of  blocking  refines  failure  sets  over  that  of  maximal  traces; 

•  if  the  observability  of  blocking  is  made  dynamic  then  we  get  the  failure  traces; 

•  observability  of  the  actions  that  a  process  can  make  gives  the  power  to  ready  sets; 
the  dynamisation  of  the  ready  actions  leads  to  ready  traces; 

•  on  the  other  hand,  giving  the  power  of  copying  leads  to  simulation  and  the  power 
of  global  testing  leads  to  bisimulation  equivalence. 

3.4  Observational  and  bisimulation  equivalence 

Now  that  we  know  that  bisimulation  is  strong  and  very  nice  from  the  point  of  view 
of  equivalences,  let  us  look  at  bisimulation  from  a  computational  point  of  view.  An 
immediate  question  that  arises  is: 

Is  bisimulation  based  on  a  reasonable  notion  of  observable  behaviour?  or  is  it  too 
fine? 
Figure 8. Relationship among various equivalences.

In  other  words,  is  there  any  way  one  could  observe  all  the  distinctions  it  makes  by 
performing  experiments?  For  example,  consider  the  processes  shown  in  figure  9. 

Obviously, $p \not\sim_B q$. However, by performing experiments on the observable actions
there  is  no  way  the  two  can  be  distinguished. 

The  question  of  looking  at  bisimulation  from  the  point  of  view  of  the  underlying 
traces has been addressed in Bloom et al (1988, pp. 229-239). They argue that the
notion of bisimulation cannot be captured as a trace congruence of any
“reasonable” process constructions. Larsen & Skou (1989) have defined the notion
of probabilistic bisimulation to tackle the argument against bisimulation given in
Bloom et al (1988, pp. 229-239). Groote & Vaandrager (1989, pp. 423-438) have
studied the relation between bisimulation and structured operational semantics as
well as the property of full abstractness. An attempt towards a unification of the
frameworks has been envisaged in Abramsky & Vickers (1991).

3.5  Other  factors  related  to  models  of  concurrency 

Figure 9. Bisimulation and observational equivalence.

Figure 10. Effect of $\tau$ actions on equivalences.

3.5a Treatment of silent actions: In the transition system given above, we have not
said anything about the type of communication. A model of the simple synchronous
communication can be obtained by adding the following axiom for our earlier transition
system.


Here, $\tau$ is referred to as the silent or the perfect action (Milner 1988). In a sense, we
can  now  ask  the  question:  to  what  extent  are  the  silent  steps  observable ?  From  the 
point  of  view  of  observation,  one  can  ask  questions  like:  can  one  observe  silent  actions 
before  or  after  an  observable  event?  For  example,  depending  upon  the  choice  of 
equivalence  or  inequivalence  of  the  processes  shown  in  figure  10,  one  gets  different 
models. 

3.5b  Linear  time  vs  branching  time:  In  the  previous  sections,  we  have  essentially 
considered  two  classes  of  equivalences: 

•  pure  traces  or  refinements  of  traces; 

•  bisimulation. 

The  first  class  can  be  termed  linear  time  equivalences  in  that  a  process  is  determined 
by  its  possible  executions.  The  second  class,  that  is  bisimulation  can  be  termed 
branching  time  equivalence  which  not  only  preserves  the  traces  but  also  the  branching 
structure  of  processes.  One  of  the  most  popular  arguments  in  favour  of  the  branching 
time  semantics  was  the  fact  that  it  allows  a  proper  modelling  of  deadlock  behaviour 
whereas  the  linear  time  does  not.  However,  it  can  be  seen  from  the  various  equivalences 
discussed  that  even  though  this  is  true  for  the  case  considering  pure  traces,  the  same 
comment  does  not  hold  in  the  context  of  ready  or  failure  sets.  In  fact,  an  additional 
advantage  of  the  linear  time  equivalences  discussed  above  is  that  one  also  gets  the 
notion  of  testing  or  observation  for  distinguishing  processes.  The  main  criticism  of 
branching time structure is that distinctions between processes are made that cannot
be  observed  or  tested,  unless  observers  are  equipped  with  extraordinary  abilities  like 
copying  or  global  testing  (cf.  Abramsky  1987). 

Even  though  bisimulation  preserves  the  branching  structure  of  the  processes,  an 
anomaly  arises  in  the  context  of  Milner’s  observational  equivalence  as  illustrated 
(from  Van  Glabbeek  &  Weijland  1989)  in  figure  11. 

It may be noted that in figure 11(A), we have a path $a \tau b \tau c$ with outgoing edges
$d_1, d_2, \ldots$, and it follows easily that all the three graphs are observation equivalent.
It may be noted that b-edges (shown in broken lines) may be added without disturbing
the equivalence. However, in both (B) and (C), a new computation path is introduced
in which an outgoing edge $d_2$ (or $d_3$ respectively) is missing; in fact such a path did
not occur in (A). In other words, in the path introduced in (B) the options $d_1$ and $d_2$
are discarded simultaneously, whereas in (A) it corresponds to a path containing a
state where the option $d_1$ is already discarded but $d_2$ is still possible. Further, in the
path introduced in (C) the choice not to perform $d_3$ is already made with the execution
of the b-step, whereas in (A) it corresponds to a path in which this choice is made only
after the b-step. From this it follows that observation equivalence does not preserve
the branching structure of processes and hence lacks one of the main characteristics
of bisimulation semantics.

Figure 11. Observation equivalence in branching time.


3.5c  Interleaving  vs  true  concurrency:  Two  extreme  ways  of  modelling  concurrency 
are: 

•  concurrency  is  nothing  but  nondeterministic  interleaving  of  concurrent  events; 

•  concurrency  is  a  phenomenon  quite  independent  from  nondeterminism. 

Consider the process a.nil || b.nil, which is specified to do the actions a and b
concurrently. In the first view, the trace semantics of this process is given by

$\{\varepsilon, a.b, b.a\}$.

Figure 12. Interleaving vs. true concurrency.


Thus,  in  this  model,  this  process  is  identified  with  another  process  a.b.nil  +  b.a.nil, 
which  does  actions  a  and  b  one  after  another  but  nondeterministically  in  either  order. 
A  typical  model,  that  distinguishes  concurrency  from  nondeterminism,  is  the  partial 
order model. In this model, the above process is given by the poset $\langle \{a, b\}, \le \rangle$, where
the events a and b are unrelated by the relation $\le$; in this model for actions $x, y$, $x \le y$
means  that  x  occurs  before  y  and  if  two  actions  are  unrelated,  then  it  is  not  known 
in  which  order  these  actions  take  place. 
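The relation between the two views can be sketched in a few lines: the interleavings of a process are the linearizations of its partial order, so the poset determines the interleavings but not conversely. The code below is an illustrative sketch only; the function name `linearizations` and the encoding of the order as a set of (earlier, later) pairs are assumptions made for this example.

```python
from itertools import permutations


def linearizations(events, before):
    """All total orders of `events` consistent with the pairs (x, y) in `before`,
    where (x, y) means that x occurs before y."""
    result = set()
    for perm in permutations(sorted(events)):
        position = {e: i for i, e in enumerate(perm)}
        if all(position[x] < position[y] for x, y in before):
            result.add(perm)
    return result


# a.nil || b.nil: the two events are unrelated in the partial order.
concurrent = linearizations({"a", "b"}, set())
# a.b.nil: a is ordered before b.
sequential = linearizations({"a", "b"}, {("a", "b")})
```

The unordered poset yields both interleavings, which is all that the interleaving view records and is the same set that the process a.b.nil + b.a.nil produces; the partial order model keeps the additional information that neither order is imposed.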

The  first  model  reduces  concurrency  to  nondeterminism.  As  a  consequence,  any 
particular  action  in  a  process  may  be  arbitrarily  delayed.  If  other  component 
process(es) involves an infinite number of actions then there is no upper bound on
the time within which any action will be executed. In the worst case, an action may
be delayed forever. In the second technique, we have an extra “simultaneous” operator
and  two  events  will  be  related  with  each  other  if  they  are  causal  with  reference  to 
each  other.  The  following  trivial  example  shown  in  figure  12  illustrates  the  difference 
between  the  two  informally. 

3.5d  Treatment  of  divergence:  Another  crucial  point  lies  in  the  way  infinite  sequences 
of $\tau$-steps in a process are treated. In the failure semantics proposed by Brookes et al
(1984), all processes having an infinite $\tau$-sequence from the root are set equal (to
process CHAOS). For example, one can generalize such notions by equating the following
two processes as shown in figure 13.

The  notion  of  bisimulation  is  more  discriminating.  The  advantage  is  that  process 
models  obtained  by  bisimulation  equivalence  satisfy  useful  abstraction  principles 
based  on  fairness.  For  example,  Koomen’s  fair  abstraction  rule  gives  a  way  of 
simplifying processes by elimination of (some) infinite $\tau$-sequences. This elimination
can be understood as fairness of (visible) actions over silent $\tau$-steps.

3.6  Concurrency  and  real-time 

From  the  survey  on  concurrency,  it  must  be  apparent  that  the  concurrency  theory 
can  be  seen  as  an  abstraction  of  observation.  In  a  sense,  for  a  natural  and  a  formal 
abstraction  of  distributed  systems  it  is  necessary  that  theories  must  take  into  account 
the  physical  laws  that  distributed  systems  must  obey.  One  of  the  prime  factors  that 
must necessarily be tackled is the relation between logical time and physical time for
the understanding of real-time distributed systems. It is a standard paradigm of physics
to understand the notion of atomicity of a thing or explore the internal structure of
aspects previously considered as atomic. Thus, an observation in the context of
concurrency in the presence of real-time necessitates the understanding of an event
and atomicity. In the following sections, we provide a background on the notion of
an event and the notion of atomicity and discuss the various time domains that have
been used in the specification of real-time distributed systems. With such a background,
we discuss the possible choice of the various concurrency models in the context of
time as an observable entity.

Figure 13. Effect of divergence on equivalence.

3.6a  Events  and  time  domains:  One  of  the  fundamental  notions  that  needs  a  careful 
examination  is  the  notion  of  an  event.  In  fact,  it  is  from  such  an  analysis  one  can 
capture  the  notion  of  observational  behaviour  for  real-time  reactive  systems.  This 
problem  has  been  nicely  dealt  with  in  Lelann  (1983)  with  respect  to  the  Newtonian 
and  relativistic  notions  of  observation. 

Based  on  the  notions  of  observability  one  of  the  immediate  questions  that  arises 
is:  Is  an  event  atomic ?  If  we  consider  an  event  as  being  something  instantaneous  that 
exists  or  not,  then  the  question  does  not  arise.  Obviously,  an  invariant  universe  would 
not  define  time,  and  would  not  need  it.  It  is  only  because  states  change  that  time 
acquires meaning. Thus, there is a need to consider some physical universe, $\mathcal{U}$, which
includes elementary entities. Every entity will be associated with a set of states. An
entity can only be in one state at a time.

Without loss of generality, we need only two states, say true and false. We will be
interested only in the changes that bring an entity from state false to state true.
Reaching state true is what constitutes an instantiated event in universe $\mathcal{U}$. By
definition of an event, entities cannot be observed while states are being changed.
Consequently, an elementary entity state change [false $\to$ true] is the smallest atomic
operation one can conceive in $\mathcal{U}$ (symbol $\to$ reads precedes). Two successive state
changes [false $\to$ true] for a given entity in $\mathcal{U}$ correspond to two infinitesimally close
points in $\mathcal{U}$'s spacetime. Now, time can only be defined and instantiated as a change
of state occurring at some location, e.g., rise of a pulse on a wire, or a division on
a  clock  face  etc. 

In  trying  to  define  time  for  an  instantiated  event,  we  find  ourselves  back  in 
considering  timeless  events.  We  are  then  forced  to  admit  that  definition  of  an  event 
in  some  real  universe  is  meaningful  only  if  we  assume  that  it  is  possible  to  observe 
concurrent  phenomena  unambiguously  in  this  universe,  or,  in  other  words,  that  the 
ordering  of  the  termination  operations  is  an  invariant  in  this  universe.  One  of  these 
two  operations  is  instantiated  as  a  physical  clock  state  change  denoted  next. 
Assume  that  the  physical  clock  is  in  location  k.  Assume  the  other  operation  takes 
place in some other location $l$.

A change [false($l$) $\to$ true($l$)] is said to be an event $(l, t)$ if

$\{ false(l);\ t(k) \} \to true(l) \to t_{next}(k)$.

The relative ordering of false($l$) and $t(k)$ does not matter. For such an event to be
observed unambiguously in universe $\mathcal{U}$, it is necessary to assume that every change
of $t(k)$ is communicated instantaneously to all entities in $\mathcal{U}$. In practice, this entails
the following two requirements:

•  $t(k)$ is communicated with a delay that is negligible compared with the interval
separating two consecutive state changes;

•  $t(k)$ is communicated almost simultaneously to all entities, the time dispersion for
any two entities being negligible compared with the interval separating two
consecutive state changes.

Whether such requirements can be satisfied depends entirely on $\mathcal{U}$'s spacetime
topological  properties.  When  we  do  not  know  how  to  achieve  appropriately  timed 
broadcasting  of  a  unique  signal  on  different  physical  paths,  we  are  left  with  the 
problem  of  dealing  with  propagation  delays  that  are  not  negligible  compared  with 
clock  periods  and  that  may  vary  at  different  instants  over  some  given  physical  path. 
These  are  the  conditions  for  adopting  a  relativistic  view  of  time. 

Two approaches are possible, depending on $\mathcal{U}$'s spacetime topological properties:

•  Approximate some unique time dimension throughout $\mathcal{U}$ via the definition of a
spacetime-independent  transformation  F;  under  specific  timeless  assumptions 
(e.g.,  existence  of  finite  lower  and  upper  bounds  for  propagation  delays),  one  can 
devise algorithms for which proofs establish that the relative drift$^5$ of any two
clocks  will  never  exceed  some  “acceptable”  value.  We  are  then  dealing  with 
Newtonian  Physics.  Most  real  time  distributed  computing  systems  of  limited  scale 
fall  into  this  category. 

•  Correlate different time dimensions with each other throughout $\mathcal{U}$ via the definition
of  a  spacetime-dependent  transformation  F  (e.g.  Lorentzian  transformation),  when 
specific  timeless  assumptions  cannot  be  made.  For  example,  the  situation  created 
by  a  clock  signal  that  travels  faster  (respectively  slower)  than  expected  can  be 
equated  with  a  situation  where  the  receiver  of  the  signal  (respectively  the  sender) 
is  being  accelerated  relative  to  the  sender  (respectively  the  receiver).  The  same 
situation  arises  when  the  relative  motion  of  the  clocks  or  gravitational  effects  are 
not  negligible,  as  exemplified  by  the  Global  Positioning  System  (Navstar).  We  are 
then  dealing  with  Relativistic  Physics. 

Clearly,  the  problems  that  are  derived  from  the  presentation  above  cannot  be 
avoided  by  utilizing  extremely  accurate  clocks,  as  is  sometimes  believed.  Caesium 
clocks which have a timekeeping capability of 0.1 μs/day would be useful for very
closely  approximating  the  implicit  statement  that  all  clocks  behave  identically  in 
identical  circumstances.  But  even  such  good  clocks  cannot  influence  properties  of 
signal  propagation  delays. 

We have assumed so far that a state change [false $\to$ true] is the smallest atomic
operation  one  can  think  of.  But  how  is  atomicity  effectively  obtained?  One  could 
imagine  that  specific  elementary  physical  devices  could  be  built  that  would  implement, 
say  at  the  bit  level,  these  atomic  state  changes.  Unfortunately,  there  are  a  few  problems 
which  prohibit  us  from  assuming  that  such  is  the  case.  For  example,  as  the  levels  of 
energy  and  speed  of  signals  are  never  infinite  in  computing  systems,  state  changes 
are  not  instantaneous.  While  a  state  is  being  changed  (Write)  many  observations 
(Reads)  can  take  place.  These  operations  are  not  mutually  exclusive.  Reads  which 
observe internal states violate the atomicity requirement that states internal to an
operation must be kept invisible to other concurrently executing operations. One could
let  each  Read  choose  more  or  less  randomly  which  final  good  state  has  been  supposedly 
observed  (“0”  or  “1”  for  example).  It  seems  that  we  have  a  solution.  However,  we  see 
that if we want to state properties about schedules or Reads which are concurrent
with one single Write, we must assume that some final good state is eventually reached
(that is atomicity requirement Al) and that an algorithm exists whereby Reads which
have been truly concurrent with the Write are viewed as being uniquely ordered.
Such algorithms are available in the literature. We are left with the problem of deciding
how to guarantee that a final good state is eventually reached. Hardware designers
of conventional circuits have faced this problem for many years. Oscillatory and
metastable states can be entered for undetermined times by such simple devices as
flip-flops. Identical problems are encountered by VLSI circuit designers.

5 Some of these aspects have been discussed in Koymans et al (1988).

All  proposals  which  have  been  made  to  circumvent  the  problem  consist  of  enforcing 
the  existence  of  an  upper  bound  for  state  change  durations  or  observations.  A  similar 
requirement  is  necessary  for  the  algorithms  given  towards  synchronization  of  clocks. 
Can  we  say  that  such  proposals  do  not  assume  a  lower-level  physical  solution  to  the 
initial  problem?  No,  in  the  sense  that  specific  timing  properties  must  be  assumed  for 
the  basic  operations  or  state  changes. 

Again,  time  underlies  the  concept  of  atomicity.  Problems  of  time  are  dealt  with 
explicitly  by  hardware  designers  and  designers  of  real-time  systems.  These  problems 
are  in  fact  very  general  and  should  be  carefully  addressed  in  every  system  design. 
Many  of  such  assumptions  can  be  seen  in  the  spectrum  of  real-time  languages  surveyed 
in  Shyamasundar  (1991a,  b). 

In  the  following  section,  we  consider  time  domains  used  in  the  modelling  of  real-time 
systems. 

Time domains - In the linear time and branching time$^6$ models (and other models)
even  though  the  term  time  is  used,  a  very  restricted  notion  of  time  is  used,  which  is 
not satisfactory for real-time systems. In these models one can only say whether
two events took place at different points of time or not. But for modelling real-time
systems,  we  need  to  know  the  exact  times  at  which  various  events  take  place.  A 
straightforward  way  of  incorporating  a  notion  of  time  is  to  associate  with  each  event, 
the  time  at  which  that  event  takes  place.  An  immediate  question  that  arises  is:  what 
are  the  values  time  should  take?  There  are  a  number  of  proposals: 

•  the  time  values  are  integers; 

•  time  ranges  over  real  values; 

•  time  ranges  over  a  total  order  in  which  a  distance  metric  can  be  defined. 

Proponents  of  integer  time  argue  that  the  systems  being  modelled  are  discrete  systems 
and  hence  we  need  to  consider  only  discrete  integer  values.  Though  this  is  a  good 
assumption  for  synchronous  models,  the  claim  does  not  hold  in  asynchronous  systems 
in  which  different  events  can  take  place  at  points  that  are  arbitrarily  close  to  each 
other.  In  such  systems,  the  right  time  values  one  should  use  are  real  values  or  at  least 
values from an Archimedean field.

6 In fact, if we consider real-time events the same actions on different branches may not be
the same; this would have to be integrated with real-time aspects of the models.

Concurrency complicates modelling real-time systems: should one use same or
different clocks for events happening in distinct sub-components? Most of the models
make the simplifying assumption that there is a global clock according to which
different events happen. This assumption is not very different from the one in which
each concurrent subsystem has its own clock but with a definite relationship between
the time shown by different systems. In some highly distributed systems involving
autonomous components, the latter assumption is not valid. But one cannot do
reasonable real-time computing without assuming any relationship between the clocks
of different subsystems. A logic of concrete time intervals has been developed in Lewis
(1990, pp. 380-389) wherein time delays between the scheduling and occurrence of
the events that cause state changes are constrained to fall between fixed numerical
upper and lower time bounds. Such an abstraction is shown to be useful in the
modelling of asynchronous systems.

3.6b  Choice  of  concurrency  model:  Based  upon  the  various  parameters  of  concurrency 
discussed  above,  one  can  get  a  spectrum  of  semantic  models.  Broadly,  the  various 
models  can  be  categorized  into  two  distinct  classes: 

(1)  interleaving; 

(2)  true  concurrency. 

The  interleaving  model  is  simpler  as  it  reduces  concurrency  to  nondeterminism. 
Also  this  allows  uniprocessor  implementation  of  concurrency.  In  contrast,  the  second 
view  adds  an  extra  parameter  to  modelling  reactive  systems  and  hence  is  less  simple. 
But  it  is  claimed  that  it  is  the  right  view  in  decentralized  systems  involving  autonomous 
components  as  there  is  no  notion  of  a  global  ordering  of  events. 

Both  these  views  are  unsatisfactory  for  real-time  reactive  systems:  in  the  first  view, 
an  event  in  a  process  may  be  indefinitely  delayed  while  in  the  second  view  it  is  not 
known  whether  two  concurrent  events  are  executed  simultaneously  or  at  different 
times;  it  is  not  even  clear  whether  one  can  relate  the  times  of  occurrences  of  two 
concurrent events. In fact, many of the models based on true concurrency suffer from
the drawback that they either enforce complete synchronicity in executions or do not
exclude interleaving. For real-time systems, we need a model that does not allow
arbitrary delaying of event occurrences and that describes whether two events are
executed simultaneously or not. One such notion is maximal parallelism (Salwicki &
Muldner  1981).  Based  on  such  a  notion,  a  compositional  model  for  real-time  has 
been  advocated  in  Koymans  et  al  (1988).  Such  a  model  is  realistic  in  the  sense  that 
concurrent  actions  can  and  will  overlap  in  time  unless  prohibited  by  synchronization 
constraints;  in  other  words,  no  unrealistic  waiting  of  processors  is  modelled.  The 
following  examples  illustrate  the  intuitive  ideas  behind  this  model. 

Simple  shared  variable  model  -  Consider  the  following  program: 

[P1 :: x := 1 || P2 :: x := 3 || P3 :: y := 2].

Let us assume that multiple accesses to a single (shared) variable are mutually exclusive.
Then in the above program, either P1 and P3 or P2 and P3 will execute their first
move simultaneously, but not P1 and P2.
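
The claim about first moves can be checked mechanically. The following Python sketch is only an illustration under the stated mutual-exclusion assumption; the dictionary `first_move` and the helper names are hypothetical. It enumerates the maximal sets of processes whose first moves touch pairwise distinct variables, for three processes P1, P2 and P3 whose first moves access x, x and y respectively.

```python
from itertools import combinations

# First move of each process: the shared variable it accesses.
first_move = {"P1": "x", "P2": "x", "P3": "y"}

def conflict_free(group):
    """True if no two processes in the group access the same shared variable."""
    variables = [first_move[p] for p in group]
    return len(variables) == len(set(variables))

def maximal_steps(processes):
    """Return the maximal conflict-free sets of processes that can move together."""
    candidates = [
        frozenset(g)
        for r in range(1, len(processes) + 1)
        for g in combinations(sorted(processes), r)
        if conflict_free(g)
    ]
    # Keep only the sets that are not strictly contained in another candidate.
    return [s for s in candidates if not any(s < other for other in candidates)]

steps = maximal_steps(first_move)
print(sorted(sorted(s) for s in steps))  # [['P1', 'P3'], ['P2', 'P3']]
```

Under maximal parallelism only these maximal sets are admissible first steps; a step in which P3 idles while P1 moves alone is excluded.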

Distributed program (CSP-R): Consider the following program7:

P1 :: [P11 :: P2!0 || P12 :: P13!1 || P13 :: P12?x; P2!x].


7  Here, P1?x in P2 denotes that P2 is waiting to receive a communication from P1; similarly,
P1!e in P2 denotes that the process is waiting to send a value e to process P1; on
handshaking, P1!e || P2?x results in x being assigned e.

According to the interleaving semantics the following two scenarios are possible:

(1) P11 communicates with P2 while P12 communicates with P13; after that P13
communicates with P2.

(2) P12 first communicates with P13, followed by P13 with P2; finally P11 communicates
with P2.

According to the maximal parallelism semantics, only (1) is possible since P11 and P2
can immediately become involved in a handshake and hence do not wait for P12 and
P13. In other words, in distributed computing maximal parallelism can be
interpreted to mean first-come-first-served.
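
The difference between the two semantics on this example can be reproduced with a small Python sketch. It is an illustration only, under two assumptions that are not stated in the text: P2 accepts a communication from whichever partner is ready, and a handshake occupies both partners for one step. All names (`MUST_FOLLOW`, `next_steps`, `runs`) are hypothetical.

```python
from itertools import combinations

# Handshakes of the example as (sender, receiver) pairs.
A = ("P11", "P2")   # P11 sends 0 to P2
B = ("P12", "P13")  # P12 sends 1 to P13
C = ("P13", "P2")   # P13 forwards x to P2
# Only ordering constraint inside a process: P13 receives (B) before it sends (C).
MUST_FOLLOW = {A: set(), B: set(), C: {B}}

def enabled(done):
    """Handshakes not yet done whose predecessors have all been done."""
    return [h for h in MUST_FOLLOW if h not in done and MUST_FOLLOW[h] <= done]

def disjoint(group):
    """True if no process takes part in two handshakes of the group."""
    procs = [p for h in group for p in h]
    return len(procs) == len(set(procs))

def next_steps(done, maximal):
    """Possible next steps: single handshakes (interleaving) or maximal sets."""
    ready = enabled(done)
    if not maximal:
        return [frozenset([h]) for h in ready]
    groups = [frozenset(g) for r in range(1, len(ready) + 1)
              for g in combinations(ready, r) if disjoint(g)]
    return [g for g in groups if not any(g < other for other in groups)]

def runs(maximal, done=frozenset()):
    """All complete executions, each given as a tuple of steps."""
    if len(done) == len(MUST_FOLLOW):
        return [()]
    return [(step,) + rest
            for step in next_steps(done, maximal)
            for rest in runs(maximal, done | step)]

interleaved = runs(maximal=False)
parallel = runs(maximal=True)
print(len(interleaved), len(parallel))  # 3 1
```

The three interleaved runs are the two orderings of scenario (1) plus scenario (2); the single maximal-parallelism run performs the P11-P2 and P12-P13 handshakes together and then the P13-P2 handshake, which is scenario (1).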

Now,  let  us  see  how  we  can  describe  the  maximal  parallelism  semantics  for  our 
simple  language  described  earlier.  Let  us  assume  that  the  execution  of  all  basic  actions 
takes  the  same  amount  of  time. 


$$\frac{p \not\rightarrow,\quad q \xrightarrow{b} q'}{p \parallel q \xrightarrow{\{b\}} p \parallel q'}$$


In other words, a process cannot wait unnecessarily. Since enablement implies that
there must be a processor for every process, one can see that the basic maximal
parallelism model assumes that there are as many processors (machines) as there are
parallel components in the system. This assumption can however be relaxed by
requiring that an event occur not immediately but within a bounded
delay. These aspects have been addressed in Koymans et al (1988). In fact, it is also
possible to relax the requirement of one processor for every process by modelling
scheduling (in a restricted manner, this has been addressed in Liu (1989) and Liu &
Shyamasundar (1990, pp. 21-26)).

Now, let us analyse the question: Does maximal parallelism provide a good model
for real-time systems? Though the model is realistic in a sense, it suffers from some
conceptual problems. This is illustrated by the following example, shown in
figure 14(A).

[Figure 14. Effect of topology and communication medium. Panels (A) and (B); diagram label: communication medium.]

Consider a network with distributed control, and two processes A and B in different
nodes that want to communicate with a process C in a third node. If A wants to
communicate at an earlier time than B, relative to some global time scale, then
according to the first-come-first-served (fcfs) principle, indeed, A should communicate
first. Whether A’s message arrives in C before B’s message or not depends on the
topology of the network. So, imposing an fcfs principle upon the order of
communications imposes non-trivial requirements upon the underlying communication
layer, requirements that we would rather not make. Similar problems occur if
processors communicate, for example, via a common bus, where assumptions about
bus arbitration have to be taken into account. The lesson that should be drawn from
this example is that, whereas the maximal parallelism model applies the fcfs principle
to the order of initiation of requests, the principle should rather be applied to the
order in which a process becomes aware of requests. In doing so, we create the freedom
to relax the stringent impositions of the original model on the behaviour of a
communication layer. Specifically, in this way it becomes possible to vary the time gap
(0 in the original model) between the initiation and receipt of a communication request,
which reflects the uncertainties about the communication layer. This variation of the
time gap is the essential feature of the MAX_γ(δ, ε) model of distributed concurrency.
The parameters δ and ε function as lower and upper bounds on the above time gaps,
which are allowed to take on any value in between these bounds. As a consequence,
communications that are initiated too close in time (relative to the global clock)
cannot be temporally ordered anymore. These time bounds may be interpreted as an
abstraction of the propagation delays within some communication layer. The third
parameter, γ, of the model is used to extend communications in time and denotes the
number of time units a communication takes.

In the MAX_γ(δ, ε) model (see figure 14(B)), it is assumed that there is no unnecessary
waiting between the execution of actions. Communication between processes is served on a
first-come-first-served basis. Additionally, the following model pertains to process
communication:

•  processes communicate via a medium.

•  it takes between δ and ε time units (δ not included) for the medium to become
aware of a process expressing its willingness to communicate or withdrawing its
willingness (time-out).

•  communication between two processes only occurs after the medium has become
aware of both processes’ willingness.

•  a communication takes an additional γ time units during which period the processes
remain synchronized.

•  a communication that is in progress at a time when the medium receives a time-out
from one of the participating processes, will be completed; a communication that
might be started at such a time, will not be executed.
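
The timing rules listed above can be sketched in a few lines of Python. This is an illustration rather than the formal model of Koymans et al (1988): the function names are hypothetical, time-outs are not modelled, and the awareness delay d of each request is assumed to satisfy δ < d ≤ ε, which is one reading of "δ not included".

```python
def communication_window(t_a, d_a, t_b, d_b, delta, eps, gamma):
    """Start and end time of a handshake between two willing processes.

    t_a, t_b: times at which the processes express their willingness;
    d_a, d_b: delays until the medium becomes aware (delta < d <= eps assumed).
    """
    for d in (d_a, d_b):
        if not (delta < d <= eps):
            raise ValueError("awareness delay must satisfy delta < d <= eps")
    start = max(t_a + d_a, t_b + d_b)  # the medium must be aware of both partners
    return start, start + gamma        # partners stay synchronized for gamma units

def order_can_flip(t_first, t_second, delta, eps):
    """True if a request initiated later may become known to the medium earlier."""
    # Requires delays d1, d2 in (delta, eps] with t_second + d2 < t_first + d1.
    return t_second + delta < t_first + eps

print(communication_window(0, 3, 1, 2, delta=1, eps=3, gamma=2))  # (3, 5)
print(order_can_flip(0, 1, delta=1, eps=3))  # True
print(order_can_flip(0, 5, delta=1, eps=3))  # False
```

With δ = 1 and ε = 3, requests initiated one time unit apart cannot be temporally ordered by the medium, whereas requests five units apart always can; this is the effect described above for communications initiated too close in time.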

The  formal  details  of  these  models  are  discussed  in  Koymans  et  al  (1988). 

There is another model of reactive systems, referred to as the strong synchronous
model, that is useful when there is no need for an explicit clock. According to the strong
synchrony hypothesis, any event, be it a communication event between two distant
machines or a local event occurring within a single machine, takes place
instantaneously. Obviously, this hypothesis is not valid for large systems extended in space.
However, it is a very useful simplifying assumption for embedded systems occupying
a small space.

Having chosen a concurrency model, the other important aspect that needs to be
looked into is: To what extent should the real-time aspects be incorporated in the
given concurrency model? An important question that arises is:
Is there need for a special status of time or is it another parameter of the state?
In  fact,  such  a  question  has  already  been  studied  in  the  modelling  of  dynamic  systems 
where  a  special  status  is  accorded  to  time  (based  on  which,  one  gets  various  classes 
of  equations).  In  the  same  way,  even  in  the  context  of  programming,  it  appears 
necessary  to  accord  a  special  status  to  the  “time”  parameter  in  order  to  specify  various 
real-time  properties.  It  may  be  noted  that  the  parameter  “time”  is  already  distinguished 
by  the  fact  that  it  is  continuous,  monotonic  and  divergent. 

On  the  whole,  a  real-time  model  for  concurrency  depends  upon: 

(1)  what  can  be  specified/proved  in  the  given  model  of  time? 

(2)  what  is  the  complexity  of  the  decision  procedures? 

(3)  what  is  the  relation  of  the  established  property  to  the  physical  system  property? 


4.  Challenges  in  the  design  of  real-time  systems 

The  challenges  that  underlie  the  design  of  real-time  systems  can  be  broadly  categorized 
into: 

•  specification  and  verification  of  real-time  programs; 

•  real-time  programming  languages; 

•  systematic  development  of  real-time  programs; 

•  real-time  scheduling; 

•  tools  for  the  design  of  communication  protocols. 

In  the  following,  we  discuss  these  aspects  in  detail. 

4.1  Specification  and  verification  of  real-time  programs 

Specification formalisms are central to the problem of developing safe, reliable real-time
distributed systems. Handling real-time will not only require the development of
specification  and  development  frameworks  but  might  also  require  a  revision  of  the 
basic  models  that  have  been  used  so  far  in  dealing  with  concurrency.  One  of  the  main 
goals  of  any  specification  formalism  would  be  to  bridge  (or  narrow)  the  gap  between 
specification  and  implementation.  The  next  question  is:  What  are  the  general 
properties  for  any  candidate  formalism?  Obviously,  the  formalism  should  support 
compositional  verification.  That  is,  it  should  be  possible  to  verify  the  specification  of 
a  program  based  entirely  on  the  specification  of  its  constituent  components  without 
looking  into  the  interior  structure  of  the  components.  In  fact,  it  is  preferable  to  support 
the  stronger  notion  of  modularity  (cf.  Zwiers  1988)8.  Of  course,  any  automated  (even 
partial)  support  environment  would  be  a  welcome  feature  of  any  method  for  the 
design  of  a  complex  system.  Generally  speaking  one  should  address  the  following 
questions: 

•  Given a model, find the most suitable formalism in which to express a given
property.

•  For deriving manageable verification techniques, it is necessary to strike a tradeoff
among ease of expression, generality and amenability. It may be observed that the
more general the specification formalism, the easier it is to specify; however, the associated
verification method becomes harder. As in any area, researchers have considered
subsets of the general problem and devised nice techniques (for a survey, see
Shyamasundar 1991a, b). We have already seen the various issues of modelling
real-time systems in the earlier sections.


8  For compositionality, one requires that from a given complete program specification it is
necessary to establish the existence of specifications of the components from which the complete
specification can be deduced. However, for modularity, one has to establish that a deduction
of the complete program (or specification) is possible from a given a priori specification of the
components. Needless to say, the latter goes naturally with the philosophy for the design of
large programs.

Most  of  the  existing  works  make  the  simplifying  assumption  that  there  is  a  global 
clock  according  to  which  different  events  happen.  This  assumption  is  not  very  different 
from  the  one  in  which  each  concurrent  subsystem  has  its  own  clock  but  with  a  definite 
relationship  between  the  time  shown  by  different  systems.  In  some  highly  distributed 
systems  involving  autonomous  components,  the  latter  assumption  is  not  valid.  But 
one cannot do reasonable real-time computing without assuming any relationship
between  the  clocks  of  different  subsystems.  Coming  to  modelling,  the  approach  of 
modelling  concurrency  via  nondeterminism  can  be  immediately  ruled  out  from 
considerations  of  predictability.  However,  it  is  necessary  to  model  the  nondeterministic 
environment.  The  work  on  real-time  systems  can  be  broadly  divided  into  the  following 
two  streams: 

(i)  Strongly  synchronous  systems  -  Here,  interaction  between  the  components  of  the 
systems  as  well  as  the  environment  is  synchronous  and  instantaneous,  control  or 
communication  does  not  take  any  time,  and  further,  there  is  explicit  notion  of  clock. 
Further,  nondeterminism  is  completely  ruled  out.  In  a  sense,  the  focus  here  is  on  ideal 
system  behaviour  as  in  some  parts  of  engineering  and  mathematics. 

(ii)  Asynchronous  distributed  systems  -  Here,  the  interactions  are  asynchronous  and 
take  arbitrary  but  bounded  (it  can  vary  between  some  limits)  time. 

But  in  practice,  systems  are  neither  purely  synchronous  nor  purely  distributed. 
Some  layers  (parts)  of  the  systems  will  be  synchronous  while  certain  other  layers 
(parts)  will  be  asynchronous.  A  robot  is  a  typical  abstraction  of  such  a  system.  A 
robot  consists  of  a  number  of  sensor/actuator  components  -  one  for  each  of  its  hands 
and  legs,  a  sensor  to  see,  a  sensor  to  hear  and  so  on.  Each  of  these  sensors  is  localized 
and,  hence,  a  strong  assumption  about  synchronicity  is  viable.  In  order  for  the  robot 
to  do  globally  meaningful  tasks  (like  moving  around  space  avoiding  obstacles,  moving 
objects  from  one  place  to  another)  all  the  sensors  in  its  body  will  have  to  interact 
with  each  other.  Since  these  sensors  are  distributed  over  the  entire  body  of  the  robot 
the  communication  delay  between  them  will  be  appreciable  and  cannot  be  ignored. 
Hence  the  interaction  between  the  sensors  will  have  to  be  modelled  by  asynchronous 
communication.  Further,  in  the  modelling  of  real-time  systems  it  becomes  necessary 
to  model  nondeterminism  due  to  the  environment;  note  that  it  should  also  be  possible 
to  capture  the  predictable  (does  not  necessarily  mean  deterministic)  requirements  of 
the  real-time  systems.  In  short,  a  unified  integrated  approach  of  strongly  synchronous 
and  asynchronous/synchronous  will  provide  a  nice  formalism  for  the  specification  of 
real-time  distributed  systems.  Such  an  approach  will  also  throw  light  on  unification 
of  the  various  theories  of  concurrency.  It  may  be  noted  that  the  unification  also 
requires  refinements  of  the  semantics/the  proof  theory  of  the  strongly  synchronous 
and  asynchronous  distributed  systems. 

4.2  Real-time  programming  languages

One  of  the  main  goals  of  a  programming  language  is  to  provide  a  natural  vehicle  for 
expressing  good  ideas  elegantly.  However,  if  we  look  at  a  large  spectrum  of  real-time 
languages,  the  languages  do  not  reflect  any  evolution  with  respect  to  assembly 
languages.  However,  the  scene  is  changing  rapidly  and  low-level  programming 
techniques  will  not  remain  acceptable  for  large  safety-critical  systems  (cf.  Berry  1989). 
Real-time  programming  will  follow  the  modern  tendency  to  make  systems  hardware 
independent:  software  has  a  longer  lifetime  than  hardware.  Some  of  the  main  issues  of 
research  are: 

(1)  expressibility  of  timing  requirements; 

(2)  exception  handling  mechanisms; 

(3)  efficiency; 

(4)  formal  semantics  and  verifiability  -  it  is  important  to  consider  realistic  models  of 
communication,  concurrency  and  time.  Further,  the  semantics  must  account  for 
resource  limitations  in  a  natural  way; 

(5)  an  integrated  environment  for  the  development  of  real-time  programs  -  from  the 
point  of  view  of  reliability  and  robustness,  it  is  very  essential  to  provide  analysis 
tools  for  timing  and  functional  analysis  of  the  components.  In  fact,  mechanical 
support  (with  possibly  graphical  support)  is  very  necessary  for  the  wide  acceptance 
of  any  language  for  programming  large  systems; 

(6)  reliability  and  fault  tolerance  of  programs  -  it  is  important  to  obtain  a  proper 
tradeoff  between  hardware  and  software  to  cater  to  a  variety  of  applications; 

(7)  object-oriented  paradigm  -  as  discussed  already,  flexibility  is  one  of  the  most 
important  factors  in  the  design  of  real-time  systems.  Object-oriented  programming 
perhaps  would  provide  a  good  insight  into  these  aspects.  Broadly,  an  object-oriented 
program  consists  of  objects  and  methods.  An  object  may  ask  for  methods  defined 
in  it  or  in  other  objects.  In  other  words,  one  can  define  methods  based  on  various 
criteria  (perhaps  including  performance  criteria)  and  the  system  can  call  the 
appropriate  methods  based  on  the  need.  Such  a  design  will  go  a  long  way  in 
providing  a  basis  for  portability  satisfying  timing  constraints  and  would  support 
even  bottom-up  techniques  of  building  systems.  Shyamasundar  et  al  (1991) 
have  shown  formally  that  object-oriented  programming  is  viable  for  real-time. 

4.3  Systematic  development  of  real-time  programs 

A sound methodology should enable one to arrive at a correct real-time program
from its high-level specification. It must however be noted that, from the point of
view of deriving correct implementations from a specification, it is not sufficient
to concentrate solely on the functional or the temporal requirements. The possible
implementations  of  a  real-time  system  are  quite  often  restricted  by  the  configuration 
and  resources  of  the  execution  mechanism  that  will  be  used  to  run  the  system.  Thus, 
in  order  to  judge  the  feasibility  of  the  implementation  derived  from  the  specification 
it  is  necessary  to  formalize  the  properties  of  the  execution  mechanisms  that  will  be 
used  to  run  the  system.  Hence,  apart  from  temporal  requirements,  paradigms  of 
real-time  systems  also  have  to  express  implementation  specific  characteristics  such  as: 
(i) multiprocessor/microprocessor/sequential, (ii) scheduling policies such as fixed
priority, dynamic priority, round robin, time slicing etc. and (iii) the mechanism for
the interaction with the environment such as interrupts or polling. That is, the high
level specifications will state the timing properties and other implementation
characteristics or properties of the programs being developed, while the final programs
derived using the methodology will be the ones satisfying these constraints. In fact,
a transformational methodology would have all the advantages of the traditional
stepwise refinement methodology. Finding the right level of abstraction for describing
the implementation-specific characteristics is essential for deriving implementations
from specifications. This is a major research problem.

4.4  Real-time  scheduling 

One  can  view  a  real-time  system  as  a  set  of  tasks  or  as  a  set  of  periodic  and  sporadic 
processes.  Thus,  it  is  very  essential  to  use  efficient  scheduling  strategies  for  meeting 
the  resource  and  timing  constraints.  Most  of  the  scheduling  algorithms  have  the 
following  drawbacks: 

(1)  most  of  them  are  intractable,  or 

(2)  most  of  the  algorithms  require  that  the  component  characteristics  be  known  a 
priori  and  limit  themselves  to  uniprocessor/multiprocessor  configurations. 

However, a large spectrum of process control tasks are inherently distributed with
several hard real-time constraints. Thus, it appears imperative to look for good
heuristics (cf. Stankovic 1988) from which to derive efficient scheduling
algorithms in the context of parameters such as (i) static vs dynamic scheduling,
(ii) centralized vs distributed systems, (iii) hard vs soft deadlines, (iv) preemptable vs
non-preemptable tasks, (v) fault tolerance etc.

4.5  Tools  for  communication  protocols 

Typically,  real-life  protocols  can  be  considered  to  be  a  coordinated  set  of  simple 
programs  that  are  often  time-dependent.  There  have  been  nice  formalisms  such  as 
LOTOS  for  specifying  protocols  and  workbenches  for  verifying  them.  However,  the 
formalisms  lack  the  power  to 

(1)  express  timing  constraints  such  as  minimal,  maximal,  durational  etc; 

(2)  specify  interrupts  and  priorities. 

These  features  are  very  essential  since  predictability  is  an  important  aspect  of  protocols. 
These  features  can  be  seen  in  the  language  RT-CDL  (Liu  &  Shyamasundar  1989, 
pp.  21-26)  designed  from  the  point  of  view  of  modelling  general  real-time  reactive 
systems.  With  the  ever-increasing  use  of  protocols  in  various  walks  of  life,  it  is 
important  to  arrive  at  formalisms  that  enable  the  overcoming  of  the  above  drawbacks. 
In  fact,  any  formalism  for  protocols  should  be  supported  by  a  nice  set  of  tools  that 
enable  the  users  to  formally  derive  and  verify  them.  It  may  be  noted  that  the  protocols 
are  not  necessarily  finite  state.  However,  a  large  class  of  them  are  finite  state.  Thus, 
in  developing  automated  tools,  it  is  necessary  to  look  into  aspects  of  how  much  of 
the  non-finite  state  systems  can  also  be  handled. 

5.  Conclusions

In  the  previous  sections,  we  have  articulated  real-time  systems  as  systems  that  maintain 
a  temporal  relationship  with  an  uncooperative  environment.  We  have  discussed  the 
various  issues  of  modelling  concurrency,  time  and  communication  together  and  shown 
the  various  possible  process  equivalences.  The  choice  depends  on  the  observable 
entities  and  also  on  the  application. 

Further,  we  have  argued  that  real-time  systems  have  posed  a  wide  spectrum  of 
challenges  to  the  computing  community  and  highlighted  the  challenges  in  building 
real-time  systems.  To  meet  the  challenge  it  is  very  essential  to  crystallize  the 
behavioural  model  of  real-time  systems  using  realistic  models.  To  sum  up,  one  of  the 
most  immediate  needs  is  the  discovery  of  specification  formalisms  that  can  be 
embedded  in  an  hierarchical  method  of  refinement.  Of  course,  for  the  success  of  a 
sound  methodology  it  is  very  essential  to  arrive  at  a  proper  tradeoff  among  the 
notions  of  time,  engineering  limitations  and  physical  abstractions.  From  an  engineering 
point  of  view,  there  is  a  need  to  strike  a  nice  balance  between  an  ideal  system  and 
an  actual  system  to  derive  a  nice  methodology  for  designing  real-time  systems. 

Abstract.  Formal  methods  to  specify  and  verify  concurrent  programs  with
synchronous  message  passing  are  discussed.  We  stress  the  development 
towards  compositional  methods,  i.e.  methods  in  which  the  specification 
of  a  compound  program  can  be  inferred  from  specifications  of  its 
constituents  without  reference  to  the  internal  structure  of  those  parts. 
Compositionality  enables  verification  during  the  process  of  (top-down) 
design  -  the  derivation  of  correct  programs  -  instead  of  the  more  familiar 
a-posteriori  verification  based  on  already  completed  program  codes. 
We  sketch  the  transition  from  non-compositional  towards  compositional 
methods  for  concurrent  programs,  indicating  the  main  principles  behind 
compositionality.  Having  achieved  a  compositional  framework  based  on 
classical  Hoare  triples,  we  discuss  extensions  to  achieve  a  convenient 
formalism  to  specify  and  verify  reactive  systems  that  have  an  intensive 
interaction  with  their  environment.  Next  this  Hoare-style  framework  is 
adapted  to  specify  and  verify  real-time  properties,  and  a  compositional 
proof method is formulated for real-time distributed computing. Compositional
reasoning during top-down development of a real-time program is
illustrated  by  an  example  concerning  a  watchdog  timer. 

Keywords.  Compositional methods for concurrency; real-time applications;
distributed computing.


1.  Introduction 

Formal  methods  for  the  specification  and  verification  of  distributed  systems  can  be 
classified  from  the  viewpoint  of  expressibility  (which  properties  can  be  specified), 
specification  language  (e.g.,  temporal  logic,  Hoare  triples  and  first-order  assertions), 
and  programming  features  (such  as  time-out,  various  communication  mechanisms 
and  concurrency).  In  this  paper  we  concentrate  on  the  distinction  between  proof 
methods  that  are  only  applicable  to  complete  program  code  and  methods  that  can 
be  used  to  verify  design  steps  during  the  process  of  program  development.  We  sketch 
the  development  from  a-posteriori  methods  (requiring  the  complete  program  text)
towards  compositional  methods  (supporting  verify-while-design).  Compositionality 
can  be  considered  as  a  requirement  for  hierarchical,  structured,  program  derivation. 
A  separation  of  concerns  is  desired  between  the  use  of  (and  the  reasoning  about)  a 
module  and  its  implementation.  This  leads  to  the  following  definition  of  compositionality 
for  proof  methods: 

Properties  of  a  compound  programming  language  construct  (such  as  sequential 
composition  and  parallel  composition)  can  be  deduced  from  specifications  for  its 
constituent  parts  without  any  further  information  about  the  internal  structure  of 
these  parts. 
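
A familiar instance of this definition, given here only as an illustration in standard Hoare-logic notation (the symbols $p$, $q$, $r$, $S_1$ and $S_2$ are not taken from the paper), is the rule for sequential composition. The specification of $S_1; S_2$ is deduced from the specifications of $S_1$ and $S_2$ alone; the rule never inspects the program text of either part.

```latex
\frac{\{p\}\; S_1\; \{r\} \qquad \{r\}\; S_2\; \{q\}}
     {\{p\}\; S_1 ; S_2\; \{q\}}
```

The difficulty discussed in the remainder of the paper is to obtain rules of the same shape for parallel composition, where the behaviour of one part depends on its environment.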

In  general,  compositional  program  specification  and  verification  dictates,  as  a 
principle,  that  all  aspects  of  program  execution  which  are  required  to  define  the 
meaning  of  a  compound  statement  from  its  constituents,  must  be  explicitly  addressed 
in  semantics  and  assertion  language  alike.  In  semantics  because,  otherwise,  no 
compositional  semantics  can  be  defined,  since  compositionality  in  semantics  requires 
that  the  meaning  of  a  compound  statement  is  a  function  of  the  meaning  of  its  parts 
(the  guiding  principle  of  denotational  semantics).  In  specification  languages  because, 
otherwise,  no  compositional  verification  rules  can  be  formulated  in  which  the 
specification  of  a  compound  statement  should  follow  from  specifications  of  its 
constituent  parts  without  knowledge  about  their  internal  structure  (the  internal 
structure  often  providing  implicit  information  which  has  not  been  explicitly  stated 
in  the  specification,  but  is  used  in  non-compositional  methods  (Owicki  &  Gries  1976; 
Apt  et  al  1980)).  The  rationale  for  this  principle  is  that  one  must  be  able  to  specify 
the  behaviour  of  a  module  in  isolation,  i.e.  without  any  implicit  prior  assumption 
regarding the environment within which it ultimately functions. Hence, all assumptions
which are needed regarding the environment - because these influence the behaviour
of a module - must be made explicit as parameters (in the semantics and specification
of that module alike), for only then can one abstract away from the remaining aspects
(such as inner syntactic structure).

In  case  of  shared  variable  communication  this  compositionality  principle  implies 
that  when  defining  the  behaviour  of  a  module  any  change  of  a  shared  variable  by 
the  environment  must  be  explicitly  expressed  as  an  assumption  of  that  module 
regarding  its  environment.  This  is  worked  out  in  Aczel’s  model  for  shared  variable 
semantics  as  cited  in  de  Roever  (1985b,  pp.  181-207).  Similarly,  when  considering 
distributed  communication  via  input/output-statements,  the  specification  of,  e.g.,  an 
input  statement  in  one  module  requires  explicit  expressibility  of  assumptions  regarding 
a corresponding output statement in another module. If one abstracts away
from blocking behaviour, only assumptions regarding the value communicated must
be expressible. If blocking behaviour is a focus of interest, this is again an assumption
regarding program execution which must be stated explicitly; i.e., the assertion
language must be able to express both the assumption that no communication partner
is available and the effect of that situation.

In  this  paper  we  also  discuss  the  compositional  verification  of  real-time  properties 
for  distributed  systems.  When  the  timing  behaviour  of  a  statement  is  considered,  all 
factors  concerning  the  execution  of  this  statement  which  influence  that  timing 
behaviour  must  be  expressible.  For  example,  for  real-time  systems  we  use  in  this 
paper the maximal progress assumption with respect to distributed i/o-communication
from Koymans et al (1988): no input or output statement should wait for communication
when its partner is also ready to communicate. This aspect of timing behaviour
requires,  indeed,  that  one  must  be  able  to  express  when  a  partner  is  waiting  to 
communicate.  For,  otherwise,  maximal  progress  would  not  be  expressible  within  the 
semantics,  and  hence  timing  behaviour  of  i/o-statements  could  not  be  characterized. 
This  maximal  progress  assumption,  which  represents  the  situation  that  each  process 
has  its  own  processor,  can  be  generalized  to  multiprogramming  where  several 
processes  may  share  a  single  processor.  By  introducing  priorities  for  processes  on  a 
single  processor,  certain  statements  which  are  ready  to  execute  will  not  be  executed 
on  account  of  their  priority  and  because  at  most  one  action  can  be  executed  at  a 
time  on  a  uniprocessor.  Modelling  the  timing  behaviour  of  such  statements  requires 
that  the  semantics,  and  hence  the  specification  language,  contains  primitives  to  state 
explicitly  when  a  statement  is  executing  and  when  it  is  requesting  processor  time  with 
a  certain  priority.  The  semantic  aspects  of  reasoning  formally  about  real-time  and 
scheduling  by  means  of  priorities  are  addressed  technically  in  Hooman  (1991a). 

This  paper  is  structured  as  follows.  A  programming  language  with  synchronous 
message  passing  is  defined  in  §2.  Section  3  contains  a  description  of  a  classical 
non-compositional  method,  and  we  indicate  how  a  compositional  proof  system  can 
be  achieved  for  Hoare  triples  (pre-condition,  program,  post-condition).  For  the 
specification  and  verification  of  reactive  systems,  these  triples  are  extended  in  §  4  with 
assertions  (called  assumption  and  commitment)  that  specify  the  communication 
interface  between  a  program  and  its  environment.  In  §5  we  adapt  this  Hoare-style 
framework  to  specify  real-time  properties  of  programs,  and  we  give  the  details  of  a 
compositional  proof  system  for  real-time  distributed  systems.  The  formalism  of  §  5  is 
illustrated  by  an  example  of  a  watchdog  timer  in  §  6.  The  extension  of  this  formalism 
to  assumption/commitment  based  reasoning  for  real-time  is  described  in  §  7.  In  §  8 
we  sketch  the  development  of  the  field,  leading  to  a  description  of  the  state  of  the  art 
and  the  place  of  our  work  therein. 

2.  Syntax 

We  give  syntax  and  informal  semantics  of  a  programming  language  for  distributed 
synchronous  message-passing.  Our  language  is  akin  to  Occam  (1988)  with  concurrent 
processes  that  communicate  via  message  passing  along  unidirectional  channels,  each 
connecting  two  processes.  Communication  is  synchronous,  i.e.,  both  the  sender  and 
the  receiver  have  to  wait  until  a  communication  partner  is  available. 

Let $\mathit{CHAN}$ be a nonempty set of channel names, $\mathit{VAR}$ be a nonempty set of program
variables, and $\mathit{VAL}$ be a denumerable domain of values. $\mathbb{N}$ denotes the set of natural
numbers (including 0). The syntax of our programming language is given in table 1,
with $n \in \mathbb{N}$, $n \geq 1$, $c, c_1, \ldots, c_n \in \mathit{CHAN}$, $x, x_1, \ldots, x_n \in \mathit{VAR}$, and $\vartheta \in \mathit{VAL}$.

Table 1. Syntax of the programming language.

Expression          $e ::= \vartheta \mid x \mid e_1 + e_2 \mid e_1 - e_2 \mid e_1 \times e_2$

Boolean expression  $b ::= e_1 = e_2 \mid e_1 < e_2 \mid \neg b \mid b_1 \vee b_2$

Statement           $S ::= x := e \mid c!e \mid c?x \mid S_1; S_2 \mid G \mid {*}G \mid S_1 \| S_2$

Guarded command     $G ::= [\,\Box_{i=1}^{n}\, b_i \to S_i\,] \mid [\,\Box_{i=1}^{n}\, b_i;\ c_i?x_i \to S_i\,]$


Informally, the statements of our programming language have the following meaning.

Atomic statements

• Assignment $x := e$ assigns the value of expression $e$ to the variable $x$.

• Output statement $c!e$ is used to send the value of expression $e$ on channel $c$ as
soon as an input command $c?x$ is available. Since we assume synchronous
communication, such an output statement is suspended until a parallel process
executes a corresponding input statement.

•  Input  statement  c?x  is  used  to  receive  a  value  via  channel  c  and  assign  this  value 
to  the  variable  x.  As  for  the  output  command,  such  an  input  statement  has  to  wait 
for  a  corresponding  partner  before  a  (synchronous)  communication  can  take  place. 

Henceforth  we  will  often  refer  to  an  input  or  output  statement  as  an  i/o-statement. 

Compound  statements 

• $S_1; S_2$ indicates sequential composition: first execute $S_1$, and continue with the
execution of $S_2$ if and when $S_1$ terminates.

• Guarded command $[\,\Box_{i=1}^{n}\, b_i \to S_i\,]$. If none of the $b_i$ evaluate to true then this
guarded command terminates after evaluation of the booleans. Otherwise,
non-deterministically select one of the $b_i$ that evaluates to true and execute the
corresponding statement $S_i$.

• Guarded command $[\,\Box_{i=1}^{n}\, b_i;\ c_i?x_i \to S_i\,]$. A guard (the part before the arrow) is
open if its boolean part evaluates to true. If none of the guards is open, the guarded
command terminates after evaluation of the booleans. Otherwise, wait until the
communication of one of the open guards can be performed and continue with
the corresponding $S_i$.

• Iteration ${*}G$ indicates repeated execution of guarded command $G$ as long as at
least one of the guards is open. When none of the guards is open ${*}G$ terminates.

• $S_1 \| S_2$ indicates parallel execution of the statements $S_1$ and $S_2$. The components
$S_1$ and $S_2$ of a parallel composition are often called processes.

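Henceforth we use $\equiv$ to denote syntactic equality. Conventional abbreviations are
used, such as $\mathit{true} \equiv 0 = 0$, $\mathit{false} \equiv \neg \mathit{true}$, $b_1 \wedge b_2 \equiv \neg(\neg b_1 \vee \neg b_2)$, etc.
For a guarded command $G \equiv [\,\Box_{i=1}^{n}\, b_i \to S_i\,]$ or $G \equiv [\,\Box_{i=1}^{n}\, b_i;\ c_i?x_i \to S_i\,]$, we define
$b_G \equiv b_1 \vee \ldots \vee b_n$. Observe that conventional programming constructs can be defined
as an abbreviation:

if $b$ then $S_1$ else $S_2$ fi $\equiv [\,b \to S_1\ \Box\ \neg b \to S_2\,]$ and while $b$ do $S$ od $\equiv {*}[\,b \to S\,]$.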
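The grammar of table 1 can be mirrored by a small abstract syntax. The following is only a sketch: all class and function names are hypothetical, expressions and booleans are kept as plain strings, and the representation is slightly more permissive than the grammar because it does not forbid mixing purely boolean guards and i/o-guards in one guarded command. The `while_do` helper assumes the reading that a while loop is an iteration with a single boolean guard, which follows from the informal meaning of iteration given above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

Expr = str       # expressions are kept as text in this sketch
BoolExpr = str   # boolean expressions likewise


@dataclass(frozen=True)
class Assign:            # x := e
    var: str
    expr: Expr


@dataclass(frozen=True)
class Output:            # c!e
    chan: str
    expr: Expr


@dataclass(frozen=True)
class Input:             # c?x
    chan: str
    var: str


@dataclass(frozen=True)
class Seq:               # S1; S2
    first: "Stmt"
    second: "Stmt"


@dataclass(frozen=True)
class Guarded:           # each branch: (boolean, optional input guard, statement)
    branches: Tuple[Tuple[BoolExpr, Optional[Input], "Stmt"], ...]


@dataclass(frozen=True)
class Loop:              # *G
    body: Guarded


@dataclass(frozen=True)
class Par:               # S1 || S2
    left: "Stmt"
    right: "Stmt"


Stmt = Union[Assign, Output, Input, Seq, Guarded, Loop, Par]


def if_then_else(b: BoolExpr, s1: Stmt, s2: Stmt) -> Guarded:
    """if b then S1 else S2 fi, as a two-branch guarded command."""
    return Guarded(((b, None, s1), (f"not ({b})", None, s2)))


def while_do(b: BoolExpr, s: Stmt) -> Loop:
    """while b do S od, read as an iteration with one boolean guard."""
    return Loop(Guarded(((b, None, s),)))
```

Because the dataclasses are frozen, program terms compare structurally, which is convenient when a proof rule has to match a statement against a pattern such as a sequential or parallel composition.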


3.  Compositionality 

In  §3.1  we  explain  the  principles  of  traditional  non-compositional  methods.  The 
development  towards  compositional  proof  systems  based  on  Hoare  triples  is  described 
in  §  3.2. 

3.1  Non-compositional  methods

Classical  verification  methods  for  parallel  processes,  such  as  Owicki  &  Gries  (1976) 
for  shared  variable  communication  and  Apt  et  al  (1980),  Levin  &  Gries  (1981)  for 
synchronous message passing, consist of two stages. First a local correctness proof is
given for each of the sequential processes by associating assertions with locations in
the  program.  In  the  second,  global,  stage  a  consistency  check  is  applied  to  the  local 
proofs. 

•  For  shared  variables  this  is  the  interference  freedom  test  which  verifies  that 
assertions  in  the  proof  of  one  process  remain  valid  under  actions  of  other  processes. 

•  For  communication  via  message  passing  the  cooperation  test  is  applied  to  verify 
correctness  of  assertions  attached  to  locations  after  input-  and  output-statements. 

Such  methods  are  not  compositional  because  at  parallel  composition  they  require 
the  complete  program  text,  annotated  with  assertions,  of  the  constituent  processes. 
Moreover, they are only suited for top-level parallelism, that is, to prove correctness of
programs of the form $S_1 \| \cdots \| S_n$ where $S_1, \ldots, S_n$ are sequential processes.

As an example, we consider in more detail the method of Apt et al (1980) for
synchronous message passing. This method is based on Hoare triples (Hoare 1969),
that is, on correctness formulae of the form $\{p\}\, S\, \{q\}$ which have the following
meaning: if we start program $S$ in a state satisfying assertion $p$ (the pre-condition)
and if program $S$ terminates then assertion $q$ (the post-condition) holds for the
termination state. For example, $\{x = 5\}\ x := x + 1\ \{x = 6\}$ is a valid Hoare triple.

First we indicate how a proof system can be formulated in which valid Hoare
triples can be derived for sequential programs. Let $q[e/x]$ denote the textual
substitution of expression $e$ for each free occurrence of variable $x$ in assertion $q$. Then
we have the following assignment axiom:

Axiom 3.1. (Assignment)

$$\{q[e/x]\}\ x := e\ \{q\}.$$

Example 3.1. With this axiom we can derive $\{x = 5\}\ x := x + 1\ \{x = 6\}$, because
$(x = 6)[x + 1/x]$ equals $x + 1 = 6$, which is equivalent to $x = 5$.
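The assignment axiom can be illustrated with a few lines of Python. This is a sketch under a simplifying assumption: assertions and expressions are modelled as functions on states (dictionaries), so the substitution is semantic rather than textual. The name `substitute` is hypothetical.

```python
def substitute(q, x, e):
    """Return q[e/x]: evaluate assertion q in the state where x holds the value of e."""
    return lambda state: q({**state, x: e(state)})


post = lambda s: s["x"] == 6        # q:       x = 6
expr = lambda s: s["x"] + 1         # e:       x + 1
pre = substitute(post, "x", expr)   # q[e/x]:  x + 1 = 6

# On a range of sample states, q[e/x] holds exactly when x = 5.
agrees = all(pre({"x": v}) == (v == 5) for v in range(-10, 11))
```

The check over sample states is not a proof, but it matches the calculation in example 3.1: the computed pre-condition is true in precisely those states where $x = 5$.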

Furthermore  the  proof  system  contains  rules  for  compound  constructs.  For  instance, 
sequential  composition  is  modelled  by  the  following  rule: 

Rule 3.2. (Sequential composition)

$$\frac{\{p\}\, S_1\, \{r\}, \quad \{r\}\, S_2\, \{q\}}{\{p\}\, S_1; S_2\, \{q\}}$$

By such a rule the formula below the line can be derived from the formulae above
the line. Soundness of the rule is proved by showing that validity of the formulae
above the line implies that the formula below the line is valid. Note that this rule is
compositional because the formula for $S_1; S_2$ is derived without using the structure
of $S_1$ or $S_2$. To strengthen pre-conditions and weaken post-conditions, the proof
system contains the following rule:

Rule 3.3. (Consequence)

$$\frac{p \to p_0, \quad \{p_0\}\, S\, \{q_0\}, \quad q_0 \to q}{\{p\}\, S\, \{q\}}$$

To illustrate the rule for parallel composition in Apt et al (1980), we consider the
proof of

$$\{y = 3\}\ (a?x;\ x := x + 1;\ b!(x + 2)) \,\|\, (a!y;\ b?y;\ y := y + 2)\ \{x = 4 \wedge y = 8\}.$$

In  the  first  stage  we  attach  assertions  to  all  locations  in  the  program  text  of  the 
two  processes,  leading  to  so-called  proof  outlines : 

$$\{\mathit{true}\}\ a?x\ \{x = 3\};\ x := x + 1\ \{x = 4\};\ b!(x + 2)\ \{x = 4\},$$

and

$$\{y = 3\}\ a!y\ \{y = 3\};\ b?y\ \{y = 6\};\ y := y + 2\ \{y = 8\}.$$

In this stage only the post-conditions of assignments are verified: from the assignment
axiom we obtain $\{x = 3\}\ x := x + 1\ \{x = 4\}$ and $\{y = 6\}\ y := y + 2\ \{y = 8\}$. Observe that
the post-conditions of the input statements $a?x$ and $b?y$ express assumptions about
the values sent by the communication partner.

These assumptions are verified in the second stage by means of the cooperation
test.¹ In general, this test requires that for $\{p_1\}\, c?x\, \{q_1\}$ and $\{p_2\}\, c!e\, \{q_2\}$ in the proof
outlines of two processes we have to prove $\{p_1 \wedge p_2\}\ c?x \| c!e\ \{q_1 \wedge q_2\}$, which is
equivalent to proving $\{p_1 \wedge p_2\}\ x := e\ \{q_1 \wedge q_2\}$. In our example this leads to the proof
obligations:

$$\{\mathit{true} \wedge y = 3\}\ a?x \| a!y\ \{x = 3 \wedge y = 3\}$$

and

$$\{y = 3 \wedge x = 4\}\ b?y \| b!(x + 2)\ \{y = 6 \wedge x = 4\},$$

which are easy to prove.

After the verification of the first two stages we obtain the conjunction of all
pre-conditions from the sequential processes as the pre-condition of the complete
program and the conjunction of the post-conditions as the final post-condition. In
our example this leads to the pre-condition $\mathit{true} \wedge y = 3$ and the post-condition
$x = 4 \wedge y = 8$, which are equivalent to the required conditions.
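As a sanity check on the example, the two processes can be executed by a tiny simulator for straight-line processes with synchronous channels. This is only an illustrative sketch and not part of the proof method: the action encoding (`"in"`, `"out"`, `"assign"`) and the function `run` are assumptions made here, guarded commands and nested parallelism are not supported, and the scheduler simply picks the first matching sender/receiver pair.

```python
def run(processes, states):
    """Run straight-line processes that communicate over synchronous channels.

    processes: list of action lists; states: list of local variable dicts.
    Returns (states, history), where history lists (channel, value) records.
    """
    pcs = [0] * len(processes)
    history = []
    while True:
        # Execute local assignments until each process is at an i/o action or done.
        for i, proc in enumerate(processes):
            while pcs[i] < len(proc) and proc[pcs[i]][0] == "assign":
                _, var, expr = proc[pcs[i]]
                states[i][var] = expr(states[i])
                pcs[i] += 1
        # Look for a sender and a receiver that wait on the same channel.
        ready = [(i, proc[pcs[i]]) for i, proc in enumerate(processes)
                 if pcs[i] < len(proc)]
        pair = next(((i, a, j, b) for i, a in ready for j, b in ready
                     if a[0] == "out" and b[0] == "in" and a[1] == b[1]), None)
        if pair is None:
            return states, history          # terminated (or deadlocked)
        i, (_, chan, expr), j, (_, _, var) = pair
        value = expr(states[i])
        states[j][var] = value              # the rendezvous acts like var := expr
        history.append((chan, value))
        pcs[i] += 1
        pcs[j] += 1


# a?x; x := x + 1; b!(x + 2)
S1 = [("in", "a", "x"),
      ("assign", "x", lambda s: s["x"] + 1),
      ("out", "b", lambda s: s["x"] + 2)]
# a!y; b?y; y := y + 2
S2 = [("out", "a", lambda s: s["y"]),
      ("in", "b", "y"),
      ("assign", "y", lambda s: s["y"] + 2)]

final_states, history = run([S1, S2], [{}, {"y": 3}])
```

Starting from $y = 3$, the run ends with $x = 4$ and $y = 8$, in agreement with the post-condition derived above. A single run of course says nothing about other initial states or other schedules; that is what the proof system is for.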

3.2  Towards  compositionality 

In  this  section  we  discuss  how  a  compositional  proof  method  can  be  obtained  for 
programs  which  communicate  via  synchronous  message  passing.  First  the  cooperation 
test  from  Apt  et  al  (1980)  is  removed  by  not  allowing  implicit  assumptions  in  the 
post-conditions  of  i/o-statements.  The  local  proof  of  a  sequential  program  should  be 
valid  in  any  arbitrary  environment.  Since  this  would  weaken  the  method,  and  any 
valid  post-condition  should  be  provable,  we  use  a  history  variable  h  which  denotes 
the communication history of the complete program. A (communication) history is
a sequence of records $(c, \vartheta)$ where $c$ is a channel name and $\vartheta$ a value. For example,
$\langle (c,5), (b,6), (a,8), (b,0) \rangle$ is a history expressing four communications: first one via
channel $c$ with value 5, then a communication via $b$ with value 6, etc. Let $\langle\rangle$ denote
the empty sequence. History variable $h$ does not appear in the program, but it is updated
implicitly in the semantics of i/o-statements. This leads to the following valid formulae.

• For an output command we have, for example,

$$\{h = \langle (c,5) \rangle\}\ b!6\ \{h = \langle (c,5), (b,6) \rangle\}.$$


• For an input statement $c?x$ we can only express in the post-condition that there
exists a value which is communicated via $c$ and assigned to $x$. For instance,

$$\{h = \langle\rangle\}\ c?x\ \{\exists v : h = \langle (c,v) \rangle \wedge x = v\}.$$

¹ In the full method of Apt et al (1980) auxiliary variables and a global invariant are used
to make the method complete, i.e., to guarantee that any valid Hoare triple can be proved.

Example 3.2. To prove $\{\mathit{true}\}\ c!5 \| c?x\ \{x = 5\}$, we use the Hoare triples

$$\{h = \langle\rangle\}\ c!5\ \{h = \langle (c,5) \rangle\} \quad \text{and} \quad \{h = \langle\rangle\}\ c?x\ \{\exists v : h = \langle (c,v) \rangle \wedge x = v\}.$$

Suppose the pre- and post-conditions after parallel composition are obtained by
simply taking the conjunction of, respectively, pre- and post-conditions of the
sequential programs. Then

$$\{h = \langle\rangle\}\ c!5 \| c?x\ \{h = \langle (c,5) \rangle \wedge \exists v : h = \langle (c,v) \rangle \wedge x = v\}.$$

Since the post-condition implies $\exists v : v = 5 \wedge x = v$, and hence $x = 5$, the consequence
rule leads to $\{h = \langle\rangle\}\ c!5 \| c?x\ \{x = 5\}$. By a so-called substitution rule (not given in
this paper), we could substitute $\langle\rangle$ for $h$ in the pre-condition, thus obtaining
pre-condition true. □

Example 3.2 suggests the following rule:

$$\frac{\{p_1\}\, S_1\, \{q_1\}, \quad \{p_2\}\, S_2\, \{q_2\}}{\{p_1 \wedge p_2\}\, S_1 \| S_2\, \{q_1 \wedge q_2\}}$$

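Example 3.3. Consider again $S_1 \| S_2$, where $S_1 \equiv a?x;\ x := x + 1;\ b!(x + 2)$ and
$S_2 \equiv a!y;\ b?y;\ y := y + 2$. First derive the following Hoare triples:

$$\{h = \langle\rangle\}\ a?x\ \{\exists v_1 : h = \langle (a,v_1) \rangle \wedge x = v_1\},$$

$$\{\exists v_1 : h = \langle (a,v_1) \rangle \wedge x = v_1\}\ x := x + 1\ \{\exists v_1 : h = \langle (a,v_1) \rangle \wedge x = v_1 + 1\},$$

and

$$\{\exists v_1 : h = \langle (a,v_1) \rangle \wedge x = v_1 + 1\}\ b!(x + 2)\ \{\exists v_1 : h = \langle (a,v_1), (b,v_1 + 3) \rangle \wedge x = v_1 + 1\}.$$

By two applications of the sequential composition rule we obtain

$$\{h = \langle\rangle\}\ a?x;\ x := x + 1;\ b!(x + 2)\ \{\exists v_1 : h = \langle (a,v_1), (b,v_1 + 3) \rangle \wedge x = v_1 + 1\}.$$

Similarly,

$$\{h = \langle\rangle \wedge y = 3\}\ a!y;\ b?y;\ y := y + 2\ \{\exists v_2 : h = \langle (a,3), (b,v_2) \rangle \wedge y = v_2 + 2\}.$$

Then the parallel composition rule above leads to

$$\{h = \langle\rangle \wedge y = 3\}\ S_1 \| S_2\ \{\exists v_1 : h = \langle (a,v_1), (b,v_1 + 3) \rangle \wedge x = v_1 + 1\ \wedge\ \exists v_2 : h = \langle (a,3), (b,v_2) \rangle \wedge y = v_2 + 2\}.$$

The post-condition implies $\exists v_1, v_2 : v_1 = 3 \wedge v_2 = v_1 + 3 \wedge x = v_1 + 1 \wedge y = v_2 + 2$, which
leads to $x = 4 \wedge y = 8$. Thus, by the consequence rule, $\{h = \langle\rangle \wedge y = 3\}\ S_1 \| S_2\ \{x = 4 \wedge y = 8\}$.

(Again $h = \langle\rangle$ in the pre-condition can be removed by a substitution rule.) □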
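The last step of example 3.3, from the conjunction of the two post-conditions to $x = 4 \wedge y = 8$, can be checked by a bounded search. The sketch below is an illustration only: it assumes a finite value range `VALUES` (hypothetical, chosen large enough to contain the values of the example) and encodes histories as Python lists of `(channel, value)` pairs.

```python
from itertools import product


def q1(h, x, v1):
    # h = <(a, v1), (b, v1 + 3)>  and  x = v1 + 1
    return h == [("a", v1), ("b", v1 + 3)] and x == v1 + 1


def q2(h, y, v2):
    # h = <(a, 3), (b, v2)>  and  y = v2 + 2
    return h == [("a", 3), ("b", v2)] and y == v2 + 2


VALUES = range(0, 10)   # assumed finite search range

solutions = set()
for v1, v2, x, y in product(VALUES, repeat=4):
    h = [("a", v1), ("b", v1 + 3)]   # the only history that q1 admits for this v1
    if q1(h, x, v1) and q2(h, y, v2):
        solutions.add((x, y))
```

Within the search range the only pair $(x, y)$ that satisfies both post-conditions on a common history is $(4, 8)$: the shared history variable forces $v_1 = 3$ and $v_2 = v_1 + 3$.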

Although this works nicely for two processes, the next example shows that there
is a problem if more than two processes are involved.

Example 3.4. Consider $S_1 \| S_2 \| S_3$, where $S_1 \equiv a!0;\ b?x$, $S_2 \equiv a?y;\ c!(y + 1)$, and
$S_3 \equiv c?z;\ b!(z + 1)$. Similar to example 3.3, we could first prove

$$\{h = \langle\rangle\}\ S_1\ \{q_1 \equiv \exists v_1 : h = \langle (a,0), (b,v_1) \rangle \wedge x = v_1\}$$

and

$$\{h = \langle\rangle\}\ S_2\ \{q_2 \equiv \exists v_2 : h = \langle (a,v_2), (c,v_2 + 1) \rangle \wedge y = v_2\}.$$

But then the conjunction of $q_1$ and $q_2$ implies false, whereas $S_1 \| S_2 \| S_3$ terminates and
hence does not satisfy post-condition false.

The  problem  is  that  h  denotes  the  global  history  of  the  complete  program  -  e.g., 
consisting  of  three  processes  -  whereas  each  of  the  processes  in  isolation  can  only 
describe  the  history  on  its  own  channels.  A  possible  solution  is  to  give  each  process 
its  own  history  variable,  and  to  combine  these  local  history  variables  at  parallel 
composition.  This  is  done  in  Soundararajan  (1984a),  using  a  predicate  compat.  Zwiers 
(Zwiers  et  al  1984;  Zwiers  1989),  however,  shows  that  a  concise  and  simple  rule  for 
parallel  composition  can  be  formulated  if  each  process  uses  projections  of  global 
history  variable  h  onto  its  own  channels.  Such  a  projection  expresses  the  view  of  a 
particular process on the global history. Formally, the projection of $h$ onto a set of
channel names $cset$, notation $h_{cset}$, denotes the sequence obtained from the history
denoted by $h$ by removing all records with a channel name not in $cset$. For instance,
if $h = \langle (a,0), (c,1), (b,3) \rangle$ then $h_{\{c\}} = \langle (c,1) \rangle$, $h_{\{b,c\}} = \langle (c,1), (b,3) \rangle$, and $h_{\{d\}} = \langle\rangle$.
Henceforth we write $h_c$, $h_{bc}$, and $h_d$ instead of $h_{\{c\}}$, $h_{\{b,c\}}$, and $h_{\{d\}}$, respectively.
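Projection is a plain filter on the history. A minimal sketch, assuming histories are encoded as Python lists of `(channel, value)` pairs and using the hypothetical name `project`:

```python
def project(h, cset):
    """Keep only the records of history h whose channel name is in cset."""
    return [(chan, value) for (chan, value) in h if chan in cset]


h = [("a", 0), ("c", 1), ("b", 3)]
h_c = project(h, {"c"})         # [("c", 1)]
h_bc = project(h, {"b", "c"})   # [("c", 1), ("b", 3)]
h_d = project(h, {"d"})         # []
```

The relative order of the remaining records is preserved, which is what allows a process to reason about the order of communications on its own channels.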

In the rule for $S_1 \| S_2$ we require that the post-condition of $S_i$ only refers to history
$h$ via projections on channels occurring in $S_i$, for $i = 1, 2$. If, moreover, the post-condition
of $S_i$ only refers to program variables of $S_i$, then the following rule for parallel
composition is sound:

Rule 3.4. (Parallel composition)

$$\frac{\{p_1\}\, S_1\, \{q_1\}, \quad \{p_2\}\, S_2\, \{q_2\}}{\{p_1 \wedge p_2\}\, S_1 \| S_2\, \{q_1 \wedge q_2\}}$$

Observe that this is a compositional rule because a Hoare triple for $S_1 \| S_2$ can be
derived without knowing the internal structure of $S_1$ and $S_2$. To obtain a valid rule
we  only  impose  a  simple  syntactic  requirement  on  assertions  and  processes  (the 
post-condition  of  a  process  should  only  refer  to  channels  and  variables  of  the  process 
itself).  This  simple  syntactic  check  replaces  the  cooperation  test  which  requires  proof 
outlines  for  the  complete  program  text  of  the  processes. 
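To see how projections repair the problem of example 3.4, the post-conditions can be restated over projected histories and evaluated on the one history the three processes produce. The sketch below is an illustration under assumptions: the projected post-conditions `q1`, `q2`, `q3` are a rewriting made here (they are not spelled out in the text), the existential quantifiers are eliminated by using the program variables directly, and histories are lists of `(channel, value)` pairs.

```python
def project(h, cset):
    """Keep only the records of history h whose channel name is in cset."""
    return [(chan, value) for (chan, value) in h if chan in cset]


# S1 = a!0; b?x        uses channels a and b
def q1(h, x):
    return project(h, {"a", "b"}) == [("a", 0), ("b", x)]


# S2 = a?y; c!(y + 1)  uses channels a and c
def q2(h, y):
    return project(h, {"a", "c"}) == [("a", y), ("c", y + 1)]


# S3 = c?z; b!(z + 1)  uses channels c and b
def q3(h, z):
    return project(h, {"b", "c"}) == [("c", z), ("b", z + 1)]


# The global history of S1 || S2 || S3 and the final values of x, y, z.
h = [("a", 0), ("c", 1), ("b", 2)]
all_hold = q1(h, 2) and q2(h, 0) and q3(h, 1)

# Without projections the first two post-conditions contradict each other.
global_conflict = (h == [("a", 0), ("b", 2)]) and (h == [("a", 0), ("c", 1)])
```

With projections the three post-conditions are jointly satisfiable, whereas the versions that speak about the whole of $h$ cannot hold together, since they disagree on the second record.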

Besides bottom-up verification, such a compositional rule can be used for
top-down development. To this end, a triple $\{p\}\, S\, \{q\}$ is considered as a specification
for a program $S$. Suppose we decide to implement $S$ as $S_1 \| S_2$. If we can find assertions
$p_1$, $q_1$, $p_2$, and $q_2$ that satisfy $p \to p_1 \wedge p_2$ and $q_1 \wedge q_2 \to q$, and moreover certain
syntactic requirements on the post-conditions hold, then $S_1$ and $S_2$ can be implemented
independently, using specifications $\{p_1\}\, S_1\, \{q_1\}$ and $\{p_2\}\, S_2\, \{q_2\}$.

Since  in  this  compositional  framework  programs  can  be  considered  as  black  boxes, 
and  verification  is  done  on  the  basis  of  the  specifications  only,  we  can  allow  nested 
parallelism in programs as expressed by the syntax of § 2.

4.  Extensions  of  Hoare  triples

A  Hoare  triple  is  perfectly  suited  to  describe  the  observable  behaviour  of  a  sequential 
program  which  is  given  by  initial  and  final  states.  For  a  parallel  program  also  the 
communication  behaviour  on  its  external  channels  is  observable.  Hence  a  specification 
of  a  parallel  component  should  express  this  communication  interface.  Note,  however, 
that  a  specification  {p}  S  {q}  has  an  important  limitation:  it  only  specifies  the  behaviour 
of  S  if  S  terminates.  All  non-terminating  computations  of  S  satisfy  such  a  specification 
trivially.  Thus  the  post-condition  cannot  be  used  to  express  the  communication 
interface of non-terminating programs. Therefore, a Hoare triple is extended with an
invariant, called commitment in this paper, which must hold throughout the computation.
This leads to a formula of the form $C : \{p\}\, S\, \{q\}$ where commitment $C$ describes
the  communication  interface  of  S  during  its  execution.  The  success  of  such  formulae 
in  many  applications  is  based  on  a  simple  rule  for  parallel  composition  in  which, 
besides  conjunctions  for  pre-  and  post-conditions,  also  the  conjunction  of  commitments 
can  be  taken. 

Rule 4.1. (Parallel composition)

$$\frac{C_1 : \{p_1\}\, S_1\, \{q_1\}, \quad C_2 : \{p_2\}\, S_2\, \{q_2\}}{C_1 \wedge C_2 : \{p_1 \wedge p_2\}\, S_1 \| S_2\, \{q_1 \wedge q_2\}}$$

provided, for $i = 1, 2$, the assertions $C_i$ and $q_i$ refer to $h$ only via projections on the
channels occurring in $S_i$, and $q_i$ only refers to program variables of $S_i$.

In  this  formalism  the  influence  of  the  environment  on  the  communication  behaviour 
of  a  program  can  be  expressed  by  using  implications  in  the  commitment.  The  next 
example  indicates  that  for  so-called  reactive  systems  (Harel  &  Pnueli  1985)  which 
have  an  intensive  interaction  with  their  environment,  this  style  of  specification  often 
leads  to  proofs  using  inductive  arguments.  This  motivates  a  final  extension  of  the 
Hoare-style  formulae. 

In the following examples, $\mathit{seq}_1 \preceq \mathit{seq}_2$ expresses that sequence $\mathit{seq}_1$ is an initial
prefix of sequence $\mathit{seq}_2$. For a history variable $h$, a set of channels $cset$, and a number
$i$, we use $h_{cset}[i] = (c, \vartheta)$ to denote that the record $(c, \vartheta)$ is the $i$th element of the sequence
denoted by $h_{cset}$. We assume that $h_{cset}[i] = (c, \vartheta)$ is true if $i$ is greater than the length
of $h_{cset}$.
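Both notations are easy to state operationally. A minimal sketch, with hypothetical function names and sequences encoded as Python lists; note that positions are counted from 1, as in the text, and that the indexed assertion is vacuously true beyond the end of the sequence.

```python
def is_prefix(seq1, seq2):
    """seq1 is an initial prefix of seq2."""
    return len(seq1) <= len(seq2) and seq2[:len(seq1)] == seq1


def at(h, i, record):
    """h[i] = record, with 1-based i; true by convention when i > len(h)."""
    return i > len(h) or h[i - 1] == record


h_cd = [("c", 1), ("d", 2)]
```

The convention for positions beyond the length of the history is what allows a commitment such as "every odd position carries a communication on $c$" to hold for every finite prefix of an infinite communication sequence.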

Example 4.1. We verify two reactive, non-terminating, processes that have a close
interaction. Consider $S_1 \equiv c!1;\ {*}[\,d?x \to c!(x + 1)\,]$ and $S_2 \equiv {*}[\,c?y \to d!(y + 1)\,]$. The aim
is to prove that $S_1 \| S_2$ satisfies the commitment $h_{cd} \preceq \langle (c,1), (d,2), (c,3), (d,4), (c,5), \ldots \rangle$,
that is,

$$\forall i \geq 1 : h_{cd}[2i - 1] = (c, 2i - 1) \wedge h_{cd}[2i] = (d, 2i).$$

Define

$$C_1 \equiv h_{cd}[1] = (c,1) \wedge (\forall i \geq 1\ \forall v : h_{cd}[2i] = (d,v) \to h_{cd}[2i + 1] = (c, v + 1)).$$

Then for $S_1$ we can prove

$$C_1 : \{h_{cd} = \langle\rangle\}\ c!1;\ {*}[\,d?x \to c!(x + 1)\,]\ \{\mathit{false}\}.$$

Similarly, for $S_2$ we define

$$C_2 \equiv (\forall i \geq 1\ \forall v : h_{cd}[2i - 1] = (c,v) \to h_{cd}[2i] = (d, v + 1)).$$

Then we have

$$C_2 : \{h_{cd} = \langle\rangle\}\ {*}[\,c?y \to d!(y + 1)\,]\ \{\mathit{false}\}.$$

Since the required syntactic conditions are fulfilled, we can apply the rule for parallel
composition which leads to

$$C_1 \wedge C_2 : \{h_{cd} = \langle\rangle\}\ S_1 \| S_2\ \{\mathit{false}\}.$$

To prove that $C_1 \wedge C_2$ implies the required commitment, we prove by induction on
$i$ that, for $i \geq 1$, $h_{cd}[2i - 1] = (c, 2i - 1) \wedge h_{cd}[2i] = (d, 2i)$.

• Basic step. For $i = 1$ we have, by $C_1$, that $h_{cd}[1] = (c,1)$ and thus from $C_2$ we obtain
$h_{cd}[2] = (d,2)$.

• Induction step. Assume $h_{cd}[2i - 1] = (c, 2i - 1) \wedge h_{cd}[2i] = (d, 2i)$. Then $h_{cd}[2i] = (d, 2i)$
implies, by $C_1$, that $h_{cd}[2i + 1] = (c, 2i + 1)$, and thus $h_{cd}[2(i + 1) - 1] = (c, 2(i + 1) - 1)$.
Using $C_2$ this leads to $h_{cd}[2(i + 1)] = (d, 2(i + 1))$. □
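The implication from $C_1 \wedge C_2$ to the required commitment can also be tested exhaustively on short histories. The following sketch is a bounded check and not a replacement for the induction: the value range and the maximal history length are assumptions chosen to keep the search small, and the universally quantified $v$ is handled by reading it off the record found at the relevant position.

```python
from itertools import product


def at(h, i, record):
    """h[i] = record, with 1-based i; true by convention when i > len(h)."""
    return i > len(h) or h[i - 1] == record


def C1(h):
    ok = at(h, 1, ("c", 1))
    for k in range(2, len(h) + 1, 2):        # even positions 2i
        chan, v = h[k - 1]
        if chan == "d":                      # h[2i] = (d, v) -> h[2i+1] = (c, v+1)
            ok = ok and at(h, k + 1, ("c", v + 1))
    return ok


def C2(h):
    ok = True
    for k in range(1, len(h) + 1, 2):        # odd positions 2i-1
        chan, v = h[k - 1]
        if chan == "c":                      # h[2i-1] = (c, v) -> h[2i] = (d, v+1)
            ok = ok and at(h, k + 1, ("d", v + 1))
    return ok


def target(h):
    # h[2i-1] = (c, 2i-1) and h[2i] = (d, 2i) for all positions present in h
    return all(h[k - 1] == (("c", k) if k % 2 else ("d", k))
               for k in range(1, len(h) + 1))


RECORDS = [(chan, v) for chan in "cd" for v in range(6)]   # assumed value range
histories = (list(h) for n in range(5) for h in product(RECORDS, repeat=n))
implication_holds = all(target(h) for h in histories if C1(h) and C2(h))
```

Every history of length at most four over this alphabet that satisfies both commitments is a prefix of the alternating sequence, as the inductive argument predicts.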

This example illustrates that assumptions about the environment are important in
the specification of a process, and it indicates that mutual assumptions of processes
about each other's communication interface usually lead to correctness proofs using
inductive reasoning. Based on these observations we present an extension of the
correctness formulae in which a process can be specified relative to explicit assumptions
about its environment, and an inductive relation is incorporated in the specification.
Therefore the specification formula is extended with a second invariant, called
assumption, which expresses assumptions about the environment and by which we
can strengthen post-condition and commitment. This leads to formulae of the form
$(A, C) : \{p\}\, S\, \{q\}$, where

A  is  an  assumption  describing  the  expected  behaviour  of  the  environment  of  S,  and 
C  is  a  commitment  which  is  guaranteed  by  process  S  itself,  as  long  as  the  environment 
does  not  violate  the  assumption. 

The  general  idea  is  that  assumption  and  commitment  reflect  the  communication 
interface  between  parallel  components  (and  hence  do  not  contain  program  variables), 
whereas  pre-  and  post-condition  facilitate  the  reasoning  at  sequential  composition. 

Example 4.2. In this formalism assumptions about the values sent by the environment
can be expressed explicitly. For instance, the assertion $h_c \preceq \langle (c,3) \rangle$ can be used as
an assumption:

$$(h_c \preceq \langle (c,3) \rangle,\ \mathit{true}) : \{\mathit{true}\}\ c?x\ \{x = 3\}.$$

This assumption expresses that if a communication along $c$ takes place then the
environment will send the value 3. The next formula shows that it can be used for a
commitment about the next communication:

$$(h_c \preceq \langle (c,3) \rangle,\ h_a \preceq \langle (a,4) \rangle) : \{\mathit{true}\}\ c?x;\ a!(x + 1)\ \{x = 3\}.$$ □

Assumption/commitment  based  reasoning  was  introduced  in  Misra  &  Chandy 
(1981).  A  proof  system  for  these  formulae  has  been  given  in  Zwiers  et  al  (1984).  In 
this  paper  we  discuss  the  proof  obligations  for  assumptions  and  commitments  in  the 
parallel composition rule. Consider the parallel composition $S_1 \| S_2$, and suppose we
have assumption-commitment pairs $(A_1, C_1)$ for $S_1$ and $(A_2, C_2)$ for $S_2$. Which
conditions have to be verified to obtain a pair $(A, C)$ for $S_1 \| S_2$? Consider assumption
$A_2$ of $S_2$:

• $A_2$ may contain assumptions about joint channels of $S_1$ and $S_2$ which connect these
two processes; these assumptions must be justified by the commitment $C_1$ of $S_1$.

• $A_2$ may contain assumptions about external channels of $S_2$. These assumptions
are maintained in the new network assumption $A$ for $S_1 \| S_2$.

This leads to the following proof obligation: $A \wedge C_1 \to A_2$. Similarly, $A \wedge C_2 \to A_1$.

To obtain a sound rule with these implications, the meaning of a formula
$(A_i, C_i) : \{p_i\}\, S_i\, \{q_i\}$ has to be defined carefully. A simple implication between $A_i$ and $C_i$
would, with the implications above and $A \equiv \mathit{true}$, lead to circular reasoning, that is,
$A_1 \to C_1 \to A_2 \to C_2 \to A_1$. Therefore in defining the meaning of $(A_i, C_i) : \{p_i\}\, S_i\, \{q_i\}$ we
require that if $p_i$ holds in the initial state then (1) $C_i$ holds initially, and (2) $C_i$ holds
after every communication provided $A_i$ holds after all preceding communications.
This inductive step inside the meaning of formulae is sufficient to avoid circularity
(see Misra & Chandy 1981, Zwiers et al 1984).
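The inductive reading of an assumption/commitment pair can be phrased as a check over the prefixes of one observed history. The sketch below encodes one possible reading of conditions (1) and (2) and is not the formal definition from the cited papers: it ignores pre- and post-conditions, takes "all preceding communications" to include the empty history, and uses hypothetical helper names. The example pair is the one from example 4.2.

```python
def project(h, cset):
    """Keep only the records of history h whose channel name is in cset."""
    return [(chan, value) for (chan, value) in h if chan in cset]


def is_prefix(seq1, seq2):
    return len(seq1) <= len(seq2) and seq2[:len(seq1)] == seq1


def satisfies_ac(history, A, C):
    """C must hold on each prefix as long as A held on all shorter prefixes."""
    for n in range(len(history) + 1):
        prefix = history[:n]
        if not C(prefix):
            return False      # commitment broken while the assumption still held
        if not A(prefix):
            return True       # environment broke the assumption: no further duty
    return True


# Assumption and commitment of  c?x; a!(x + 1)  from example 4.2.
A = lambda h: is_prefix(project(h, {"c"}), [("c", 3)])
C = lambda h: is_prefix(project(h, {"a"}), [("a", 4)])

good = satisfies_ac([("c", 3), ("a", 4)], A, C)        # both sides behave
excused = satisfies_ac([("c", 5), ("a", 6)], A, C)     # environment sends 5
bad = satisfies_ac([("c", 3), ("a", 9)], A, C)         # process sends 9
```

The asymmetry between the two early exits is the point: the commitment is checked one step ahead of the assumption, so a process cannot justify its own next communication by an assumption that itself depends on that communication.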

As in rule 4.1 we can take the conjunction of pre-conditions, post-conditions, and
commitments, provided, for $i = 1, 2$, the assertions $A_i$, $C_i$, $p_i$ and $q_i$ of $S_i$ refer to $h$
only via projections on the channels of $S_i$, and $p_i$ and $q_i$ only refer to program variables
of $S_i$. (Program variables are not allowed in $A_i$ and $C_i$.) With these constraints, the
following rule for parallel composition is valid.

Rule 4.2. (Parallel composition A-C)

$$\frac{(A_1, C_1) : \{p_1\}\, S_1\, \{q_1\}, \quad (A_2, C_2) : \{p_2\}\, S_2\, \{q_2\}, \quad A \wedge C_1 \to A_2, \quad A \wedge C_2 \to A_1}{(A,\ C_1 \wedge C_2) : \{p_1 \wedge p_2\}\, S_1 \| S_2\, \{q_1 \wedge q_2\}}$$

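Example 4.3. Consider $S_1 \| S_2$ where $S_1 \equiv a?x;\ x := x + 1;\ b!(x + 2)$ and
$S_2 \equiv c?y;\ a!y;\ b?y;\ y := y + 2$. Then for $S_1$ and $S_2$ we can derive

$$(A_1 \equiv h_a \preceq \langle (a,3) \rangle,\ C_1 \equiv h_b \preceq \langle (b,6) \rangle) : \{h_{ab} = \langle\rangle\}\ S_1\ \{x = 4\},$$

and

$$(A_2 \equiv h_c \preceq \langle (c,3) \rangle \wedge h_b \preceq \langle (b,6) \rangle,\ C_2 \equiv h_a \preceq \langle (a,3) \rangle) : \{h_{abc} = \langle\rangle\}\ S_2\ \{y = 8\}.$$

Since $S_1$ and $S_2$ communicate with each other via the channels $a$ and $b$, the remaining
assumption for $S_1 \| S_2$ concerns the external channel $c$: $A \equiv h_c \preceq \langle (c,3) \rangle$.

Then $A \wedge C_1 \to A_2$ and $A \wedge C_2 \to A_1$, thus the parallel composition rule leads to

$$(A,\ C_1 \wedge C_2) : \{h_{abc} = \langle\rangle\}\ S_1 \| S_2\ \{x = 4 \wedge y = 8\}.$$ □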
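The two proof obligations of example 4.3 can be checked mechanically on all short histories. The sketch below is a bounded check under assumptions: histories are lists of `(channel, value)` pairs over a small, hypothetical value set, and only histories of length at most three are enumerated.

```python
from itertools import product


def project(h, cset):
    """Keep only the records of history h whose channel name is in cset."""
    return [(chan, value) for (chan, value) in h if chan in cset]


def is_prefix(seq1, seq2):
    return len(seq1) <= len(seq2) and seq2[:len(seq1)] == seq1


A = lambda h: is_prefix(project(h, {"c"}), [("c", 3)])     # network assumption
A1 = lambda h: is_prefix(project(h, {"a"}), [("a", 3)])
C1 = lambda h: is_prefix(project(h, {"b"}), [("b", 6)])
A2 = lambda h: (is_prefix(project(h, {"c"}), [("c", 3)])
                and is_prefix(project(h, {"b"}), [("b", 6)]))
C2 = lambda h: is_prefix(project(h, {"a"}), [("a", 3)])

RECORDS = [(chan, v) for chan in "abc" for v in (3, 6, 7)]  # assumed value set
HISTORIES = [list(h) for n in range(4) for h in product(RECORDS, repeat=n)]

obligation_1 = all(A2(h) for h in HISTORIES if A(h) and C1(h))   # A and C1 -> A2
obligation_2 = all(A1(h) for h in HISTORIES if A(h) and C2(h))   # A and C2 -> A1

# Dropping the network assumption breaks the first obligation,
# e.g. for the history <(c, 7)>.
without_A = all(A2(h) for h in HISTORIES if C1(h))
```

Both obligations hold on the enumerated histories, while the variant without $A$ fails, which shows why the assumption about the external channel $c$ has to be carried over to the network.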

Example 4.4. Consider again the two reactive processes from example 4.1:

$S_1 \equiv c!1;\ {*}[\,d?x \to c!(x + 1)\,]$ and $S_2 \equiv {*}[\,c?y \to d!(y + 1)\,]$.

We show how the assumption/commitment formalism can be used to prove for $S_1 \| S_2$
the commitment $\forall i \geq 1 : h_{cd}[2i - 1] = (c, 2i - 1) \wedge h_{cd}[2i] = (d, 2i)$.

Define

$$A_1 \equiv \forall i \geq 1 : h_{cd}[2i] = (d, 2i)$$

and

$$C_1 \equiv \forall i \geq 1 : h_{cd}[2i - 1] = (c, 2i - 1).$$

Then for $S_1$ we can prove

$$(A_1, C_1) : \{h_{cd} = \langle\rangle\}\ c!1;\ {*}[\,d?x \to c!(x + 1)\,]\ \{\mathit{false}\}.$$

Similarly, for $S_2$ we define

$$A_2 \equiv \forall i \geq 1 : h_{cd}[2i - 1] = (c, 2i - 1),$$

and

$$C_2 \equiv \forall i \geq 1 : h_{cd}[2i] = (d, 2i).$$

Then we have

$$(A_2, C_2) : \{h_{cd} = \langle\rangle\}\ {*}[\,c?y \to d!(y + 1)\,]\ \{\mathit{false}\}.$$

Let $A \equiv \mathit{true}$. Then $A \wedge C_2 \to A_1$ and $A \wedge C_1 \to A_2$. Since the other conditions on the
assertions are also satisfied, we can apply the rule for parallel composition, leading to

$$(\mathit{true},\ C_1 \wedge C_2) : \{h_{cd} = \langle\rangle\}\ S_1 \| S_2\ \{\mathit{false}\}.$$

Clearly $C_1 \wedge C_2$ is equivalent to the required commitment. □

Observe  that  in  the  correctness  proof  for  the  two  reactive  processes  from  example  4.4 
there is no explicit inductive argument. The required inductive reasoning is
performed  only  once  in  the  soundness  proof  of  the  parallel  composition  rule  (rule  4.2). 
In  this  respect  we  have  obtained  a  rule  for  parallel  composition  which  is  the  analogue 
of  Hoare’s  while  rule  for  sequential  programs  (Hoare  1969). 
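
The commitment proved in example 4.4 can be checked on finite prefixes of the communication history. The following Python sketch is only an illustration and not part of the proof method: the function names `run_example_4_4` and `satisfies_commitment` are hypothetical, and the synchronous hand-shake between the two processes is assumed to be modelled adequately by a single loop that alternates between the channels c and d.

```python
def run_example_4_4(rounds):
    """Simulate S1 = c!1; *[d?x -> c!(x+1)] in parallel with S2 = *[c?y -> d!(y+1)].

    Returns the history h_cd as a list of (channel, value) pairs.
    """
    history = []
    value = 1                          # S1 starts by sending 1 along c
    for _ in range(rounds):
        history.append(("c", value))   # S1 sends, S2 receives y = value
        value += 1
        history.append(("d", value))   # S2 sends y + 1, S1 receives x
        value += 1                     # S1 will send x + 1 in the next round
    return history


def satisfies_commitment(history):
    """Check h[2i-1] = (c, 2i-1) and h[2i] = (d, 2i) on a finite history (1-indexed)."""
    for k, event in enumerate(history, start=1):
        expected = ("c", k) if k % 2 == 1 else ("d", k)
        if event != expected:
            return False
    return True


if __name__ == "__main__":
    h = run_example_4_4(3)
    print(h, satisfies_commitment(h))
```

Such a test only inspects finite prefixes; the proof rule establishes the commitment for the complete, non-terminating computation.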


5.  Compositionality  and  real-time 

In  this  chapter  we  adapt  the  compositional  Hoare-style  proof  systems  from  the 
previous  chapters  to  real-time.  We  describe  in  detail  a  compositional  method  to 
specify  and  verify  timing  constraints.  By  describing  the  details  of  a  particular 
compositional  proof  method,  we  illustrate  the  general  outline  of  such  a  description 
which  should  consist  of  the  following  points. 

1.  A  description  of  the  programming  language,  i.e.,  syntax  and  informal  semantics. 

2.  A  formal  semantics  of  the  programming  language. 

3.  The  definition  of  an  assertion  language  in  which  properties  of  programs  can  be 
expressed.  For  this  assertion  language  we  also  have  to  give  syntax,  informal 
meaning,  and  formal  interpretation. 

4.  The  definition  of  a  correctness  formula  that  relates  programs  and  assertions.  Using 
the  semantics  of  the  programming  language  and  the  interpretation  of  assertions, 
the  validity  of  such  a  correctness  formula  can  be  defined  formally. 

5.  A  proof  system  in  which,  by  rules  and  axioms,  correctness  formulae  can  be  derived 
formally. 

6.  The  proof  of  soundness  and  (relative)  completeness  of  the  proof  system:  show  that 
every  correctness  formula  that  can  be  derived  is  also  valid,  and  that  every  valid 
formula  can  be  derived  (assuming  that  valid  assertions  can  be  derived). 

As  an  example,  we  consider  in  this  section  a  compositional  proof  method  for 
distributed  real-time  systems  based  on  Hooman  (1991b).  Concerning  the  points  above,
we  describe: 

1.  A  real-time  programming  language  with  nested  parallelism,  communication  via 
synchronous  message  passing,  and  time-outs. 

2.  The  semantic  model  which  is  used  to  give  a  denotational  semantics  for  the 
programming  language;  the  meaning  of  a  program  is  given  by  a  set  of  models 
where  each  model  describes  a  possible  computation  of  the  program. 

3.  A  first-order  assertion  language  which  includes  primitives  to  specify  the  timing 
behaviour  of  programs. 

4.  A  correctness  formula  of  the  form  C:{p}S{q}. 

5.  A  compositional  proof  system  to  derive  these  extended  Hoare  triples. 

6.  The  proofs  of  soundness  and  (relative)  completeness  for  the  proof  system  given  in 
this  section  are  not  given  here.  The  reader  is  referred  to  Hooman  (1991b)  for  all 
details  about  these  proofs. 

The  syntax  and  informal  semantics  of  our  real-time  programming  language  are 
given  in  §5.1.  A  semantic  model  for  this  language  is  described  in  §5.2.  In  §5.3  we 
define the syntax and the interpretation of the assertion language and of correctness
formulae.  A  compositional  proof  system  for  this  formalism  is  presented  in  §  5.4. 

5.1  Real-time  programming  language 

Our  real-time  programming  language  is  based  on  the  Occam-like  language  from  §2 
and  is  akin  to  real-time  versions  of  CSP  as  defined  in  Koymans  et  al  (1988)  and 
Huizing  et  al  (1987).  We  add  a  real-time  statement  delay  d  which  suspends  the  execution 
for  (at  least)  d  time  units.  This  statement  is  also  used  in  the  language  Ada  (Ada  1983) 
and  corresponds  to  a  wait  d  statement  (Koymans  et  al  1988;  Huizing  et  al  1987). 
Similar to a delay-statement in the select construct of Ada, such a delay-statement is
allowed  in  a  guard  of  a  guarded  command  to  enable  the  programming  of  time-outs. 

To  investigate  the  basic  real-time  framework  and  to  highlight  the  main  points,  no 
program  variables  are  used  -  we  consider  only  the  (real-time)  communication 
behaviour.  In  Hooman  (1991b)  we  show  that  this  framework  can  be  extended  to  a 
language  with  program  variables.  Processes  communicate  and  synchronize  by  message 
passing  via  unidirectional  channels,  each  connecting  two  processes.  Communication 
is  synchronous. 

5.1a Syntax and informal meaning: Let $TIME$ be some countable ordered time domain and $\infty$ a special symbol, $\infty \notin TIME$. The syntax of our programming language is given in table 2, with $n \in \mathbb{N}$, $c, c_1, \ldots, c_n \in CHAN$, $d \in TIME$, and $d_0 \in TIME \cup \{\infty\}$, $d_0 > 0$.

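Table 2. Syntax of the programming language.

\[
\begin{array}{lrcl}
\text{Statement} & S & ::= & \mathbf{skip} \mid \mathbf{delay}\ d \mid c! \mid c? \mid S_1; S_2 \mid G \mid {*}G \mid S_1 \parallel S_2 \\
\text{Guarded command} & G & ::= & [\,\Box_{i=1}^{n}\ c_i? \rightarrow S_i\ \ \Box\ \ \mathbf{delay}\ d_0 \rightarrow S\,]
\end{array}
\]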


Since  we  have  slightly  modified  the  syntax  of  our  programming  language,  we  briefly 
mention  the  informal  meaning  of  statements. 
Atomic  statements

•  skip  terminates  immediately. 

•  delay  d  suspends  execution  for  d  time  units. 

• c! is used to send a signal along channel c. (In this chapter we do not consider the value transmitted.) Since we are assuming synchronous communication, a statement c! is suspended until the receiving process executes a statement c?.

• c? is used to receive a signal along channel c. An input statement c? is suspended until the sending process executes an output statement c!.

Compound  statements 

• $S_1; S_2$ indicates sequential composition of statements $S_1$ and $S_2$.

• Guarded command $[\,\Box_{i=1}^{n}\ c_i? \rightarrow S_i\ \ \Box\ \ \mathbf{delay}\ d \rightarrow S\,]$ is executed as follows: wait at most $d$ time units for some input guard $c_i?$ to become enabled, that is, until communication can actually occur along one of the $c_i$ because a communication partner becomes available. If at least one of the $c_i$-communications is possible (before $d$ time units have elapsed), one of these communications (non-deterministically chosen) is performed and thereafter the corresponding $S_i$ is executed. If $d \neq \infty$ and no guard is enabled within $d$ time units after the start of the execution of the command, then $S$ is executed.

Example 5.1. This construct makes it possible to model a time-out, i.e., to restrict the waiting period for certain communications. Consider the guarded command $[c? \rightarrow S_1\ \Box\ \mathbf{delay}\ 5 \rightarrow S_2]$; if there is no partner available for the input statement within 5 time units then the delay-alternative is taken and $S_2$ is executed. □

• $*G$ indicates repeated execution of guarded command $G$. Since we do not consider boolean guards in this chapter, execution of $*G$ never terminates.

• $S_1 \parallel S_2$ indicates parallel execution of $S_1$ and $S_2$.
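
The informal meaning of the guarded command with a delay alternative can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definition taken from the text: the function name `guarded_command_outcome` and its result format are hypothetical, times are measured from the start of the command, and a partner that becomes available exactly at time d is assumed to be too late.

```python
import math


def guarded_command_outcome(partner_ready, d):
    """Decide how [ []_i c_i? -> S_i  []  delay d -> S ] proceeds.

    partner_ready: dict mapping each guard channel c_i to the time at which a
                   partner becomes available (math.inf if it never does).
    d:             the time-out value (math.inf for "no time-out").
    Returns (kind, candidate_channels, time_of_decision).
    """
    earliest = min(partner_ready.values(), default=math.inf)
    if earliest < d:
        # All guards enabled at the earliest moment are candidates; the
        # choice among them is non-deterministic.
        enabled = sorted(c for c, t in partner_ready.items() if t == earliest)
        return ("comm", enabled, earliest)
    if d < math.inf:
        return ("timeout", [], d)          # the delay alternative S is taken
    return ("wait forever", [], math.inf)  # no partner and no time-out


if __name__ == "__main__":
    print(guarded_command_outcome({"c": 3}, 5))
    print(guarded_command_outcome({"c": 7}, 5))
```

With the guarded command of example 5.1, a partner arriving at time 3 leads to the communication branch, whereas a partner arriving at time 7 leads to the delay alternative at time 5.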

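We often use $[\,\Box_{i=1}^{n}\ c_i? \rightarrow S_i\,]$ as an abbreviation of $[\,\Box_{i=1}^{n}\ c_i? \rightarrow S_i\ \ \Box\ \ \mathbf{delay}\ \infty \rightarrow S\,]$, if $n > 0$. Let $DCHAN$ be the set of channels extended with directional channels:

\[
DCHAN = CHAN \cup \{c! \mid c \in CHAN\} \cup \{c? \mid c \in CHAN\}.
\]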

DEFINITION  5.1  (Channels  occurring  in  statement) 

The set of (directional) channels occurring in a statement $S$, notation $dch(S)$, is defined as the smallest subset of $DCHAN$ such that if $c$ is an output channel of $S$ then $\{c, c!\} \subseteq dch(S)$, and if $c$ is an input channel of $S$ then $\{c, c?\} \subseteq dch(S)$.

5.1b  Syntactic  restrictions:  A  number  of  syntactic  constraints  are  imposed  upon 
statements  to  guarantee  that  a  channel  connects  exactly  two  processes.  With  the 
definition  of  dch(S)  above  we  can  express  these  restrictions  formally  as  follows: 

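• For $S_1; S_2$ we require that, for all $c \in CHAN$, $c! \in dch(S_1)$ implies $c? \notin dch(S_2)$, and $c? \in dch(S_1)$ implies $c! \notin dch(S_2)$.

• For a guarded command $G \equiv [\,\Box_{i=1}^{n}\ c_i? \rightarrow S_i\ \ \Box\ \ \mathbf{delay}\ d \rightarrow S_0\,]$ we require
- for all $i \in \{1, \ldots, n\}$ that $c_i! \notin dch(G)$, and
- for all $i, j \in \{0, 1, \ldots, n\}$, $i \neq j$, and $c \in CHAN$ that $c! \in dch(S_i)$ implies $c? \notin dch(S_j)$, and $c? \in dch(S_i)$ implies $c! \notin dch(S_j)$.

• For $S_1 \parallel S_2$ we require $dch(S_1) \cap dch(S_2) \subseteq CHAN$.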
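
The definition of dch(S) and the syntactic restrictions lend themselves to a mechanical check. The sketch below is a hedged illustration, not the authors' implementation: statements are encoded as nested tuples (a hypothetical encoding chosen here), channel names are assumed to be strings without "!" or "?", and the function names `dch`, `conflicts` and `well_formed` are introduced only for this example.

```python
# Hypothetical tuple encoding of the statements of table 2:
#   ("skip",), ("delay", d), ("send", c), ("recv", c), ("seq", S1, S2),
#   ("guard", [(c_i, S_i), ...], d0, S0), ("star", G), ("par", S1, S2)

def dch(s):
    """Directional channels occurring in statement s (definition 5.1)."""
    tag = s[0]
    if tag in ("skip", "delay"):
        return set()
    if tag == "send":
        return {s[1], s[1] + "!"}
    if tag == "recv":
        return {s[1], s[1] + "?"}
    if tag in ("seq", "par"):
        return dch(s[1]) | dch(s[2])
    if tag == "star":
        return dch(s[1])
    if tag == "guard":
        _, branches, _d0, s0 = s
        result = dch(s0)
        for c, body in branches:
            result |= {c, c + "?"} | dch(body)
        return result
    raise ValueError("unknown statement: %r" % (s,))


def conflicts(a, b):
    """True if some channel is sent on in a and received on in b, or vice versa."""
    return (any(x[:-1] + "?" in b for x in a if x.endswith("!"))
            or any(x[:-1] + "!" in b for x in a if x.endswith("?")))


def well_formed(s):
    """Check the syntactic restrictions of section 5.1b recursively."""
    tag = s[0]
    if tag in ("skip", "delay", "send", "recv"):
        return True
    if tag == "star":
        return well_formed(s[1])
    if tag == "seq":
        return (well_formed(s[1]) and well_formed(s[2])
                and not conflicts(dch(s[1]), dch(s[2])))
    if tag == "par":
        shared = dch(s[1]) & dch(s[2])          # must be a subset of CHAN
        return (well_formed(s[1]) and well_formed(s[2])
                and not any(x.endswith(("!", "?")) for x in shared))
    if tag == "guard":
        _, branches, _d0, s0 = s
        bodies = [s0] + [body for _, body in branches]
        if any(c + "!" in dch(s) for c, _ in branches):
            return False
        for i, bi in enumerate(bodies):
            for j, bj in enumerate(bodies):
                if i != j and conflicts(dch(bi), dch(bj)):
                    return False
        return all(well_formed(b) for b in bodies)
    raise ValueError("unknown statement: %r" % (s,))
```

For instance, `("par", ("send", "c"), ("recv", "c"))` passes the check because the two components share only the undirected name c, whereas `("seq", ("send", "c"), ("recv", "c"))` is rejected because one process would both send and receive on c.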
5.1c  Basic  timing  assumptions:  The  precise  meaning  of  this  programming  language
is  defined  by  a  denotational  semantics  which  describes  the  real-time  behaviour  of 
programs.  Such  real-time  semantics  requires  information  about  implementation 
details  from  which  one  usually  abstracts  in  non-real-time  models,  such  as  the  execution 
time  of  assignments  and  the  time  required  to  evaluate  boolean  tests.  Thus  we  have 
to  make  assumptions  about  the  execution  time  of  atomic  statements  and  about  the 
extra  time  needed  to  execute  compound  constructs,  i.e.,  how  the  execution  time  of 
compound  constructs  can  be  obtained  from  the  timing  of  the  components.  In  our 
proof  systems  the  correctness  of  a  program  with  respect  to  a  specification,  which  may 
include  timing  constraints,  is  verified  relative  to  these  assumptions. 

In  general  we  will  have  bounds  on  the  execution  time.  Here  we  assume,  for  simplicity, 
that  there  is  no  overhead  for  compound  statements  and  that  a  delay  d  statement  takes 
exactly  d  time  units.  Furthermore  we  assume  given  a  constant  Kc  >  0  such  that  each 
communication,  i.e.,  without  the  waiting  period,  takes  Kc  time  units. 

The  most  important  assumption  involves  parallel  composition.  To  determine  the 
execution  time  of  parallel  programs  we  need  information  about  the  progress  of  actions, 
representing  the  allocation  of  processes  on  processors.  In  general,  we  have  to  make 
an  assumption  about  the  execution  model  of  parallel  processes.  In  this  paper  we 
consider  the  maximal  parallelism  model  to  represent  the  situation  that  every  process 
has  its  own  processor.  Hence,  a  process  never  waits  with  the  execution  of  a  local, 
non-communication,  statement.  An  input  or  output  statement  can  cause  the  process 
to  wait,  but  only  when  no  communication  partner  is  available;  as  soon  as  both 
partners  are  available  the  communication  must  take  place.  Thus  the  maximal 
parallelism  model  implies  a  minimal  waiting  period. 
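
A small calculation illustrates the minimal waiting implied by maximal parallelism. The Python sketch below is an illustration only: the function name `synchronise` and the shape of its result are hypothetical, the two processes are assumed to reach their communication statements at known times, and the parameter `k_c` plays the role of the constant Kc.

```python
from fractions import Fraction


def synchronise(sender_ready, receiver_ready, k_c):
    """Timing of one synchronous communication under maximal parallelism.

    The communication starts as soon as both partners are available and
    then takes k_c time units; each interval is left-closed, right-open.
    """
    start = max(sender_ready, receiver_ready)
    return {
        "wait_to_send": (sender_ready, start),        # empty if start == sender_ready
        "wait_to_receive": (receiver_ready, start),   # empty if start == receiver_ready
        "comm": (start, start + k_c),
    }


if __name__ == "__main__":
    # delay 2; c!  in parallel with  delay 5; c?  and Kc = 1
    print(synchronise(Fraction(2), Fraction(5), Fraction(1)))
```

In this run the sender waits during [2, 5), the receiver does not wait at all, and the communication occupies [5, 6).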

In  Hooman  (1991a)  we  have  generalized  this  maximal  parallelism  assumption  to 
multiprogramming  where  several  processes  can  be  executed  on  a  single  processor  and 
scheduling  is  based  on  priorities  which  can  be  assigned  to  statements  in  the  program. 

5.2  Semantic  model 

Our  formal  model  of  real-time  communication  behaviour  consists  of  a  mapping  from 
points  of  time  to  sets  of  channel  names,  indicating  the  channels  along  which  messages 
are  being  transmitted  at  any  given  time.  In  addition  to  the  names  of  the  channels 
along  which  a  communication  takes  place,  the  model  includes  information  about 
those  processes  waiting  to  send  or  waiting  to  receive  messages  on  their  incident 
channels  at  any  given  time.  Using  this  information,  the  formalism  enforces  minimal 
waiting  in  our  maximal  parallelism  model  by  requiring  that  no  pair  of  processes  is 
ever  simultaneously  waiting  to  send  and  waiting  to  receive,  respectively,  on  a  shared 
channel. 

We  express  the  timing  behaviour  of  a  program  from  the  viewpoint  of  an  external 
observer  with  his  own  clock  (as  done  in  Reed  &  Roscoe  1986,  Koymans  et  al  1988). 
Thus,  although  parallel  components  of  a  system  might  have  their  own,  physical,  local 
clock,  the  observable  behaviour  of  a  system  is  described  in  terms  of  a  single,  conceptual, 
global  clock.  Since  this  global  notion  of  time  is  not  incorporated  in  the  distributed 
system  itself,  it  does  not  impose  any  synchronization  upon  processes.  Then  we  define 
a  real-time  computation  of  a  program  by  means  of  a  function  which  assigns  to  a 
point  of  time  a  set  of  records,  representing  the  events  that  are  taking  place  at  that  point. 

In  this  paper  we  use  a  time  domain  TIME  which  is  dense,  i.e.,  between  every  two 
points  of  time  there  exists  an  intermediate  point.  With  such  a  dense  time  domain  a 
communication  can  be  represented  by  an  interval  of  time  points  during  which  we
record  this  communication,  and  we  can  easily  model  communications  that  overlap 
in  time  or  that  are  arbitrarily  close  to  each  other  in  time.  Having  dense  time  is  also 
suitable  for  the  description  of  reactive  systems  which  interact  with  an  environment 
that  has  a  time-continuous  nature  (see,  e.g.  Koymans  1990).  Furthermore,  we  argue 
that  in  a  compositional  framework  it  is  inconvenient  to  use  discrete  time.  Recall  that 
compositionality  allows  us  to  design  a  process  in  isolation  according  to  its  specification. 
With  a  discrete  notion  of  time  a  smallest  time  unit  has  to  be  chosen  in  this  specification, 
and  then  when  two  independently  developed  processes  with  different  time  units  are 
combined,  a  new  basic  time  unit  must  be  defined  and  the  specifications  of  the  processes 
have  to  be  modified  accordingly.  Finally,  a  dense  time  domain  allows  the  refinement 
of  a  single  event  into  a  sequence  of  sub-events,  such  as  the  implementation  of  a 
single  synchronous  communication  by  a  sequence  of  asynchronous  communications 
according  to  some  protocol.  An  extensive  discussion  about  the  nature  of  time  can  be 
found  in  Joseph  &  Goswami  (1989).  In  this  paper  we  use  the  non-negative  rationals 
as  our  (dense)  time  domain: 

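\[
TIME = \{\tau \in \mathbb{Q} \mid \tau \geq 0\},
\]

where $\mathbb{Q}$ is the set of rational numbers.

For notational convenience, a special value $\infty$ is used with the following properties: $\infty \notin TIME$, for all $\tau \in TIME$: $\tau < \infty$, and for all $\tau \in TIME \cup \{\infty\}$: $\tau + \infty = \infty + \tau = \infty$, $\infty - \tau = \infty$, $\tau \times \infty = \infty \times \tau = \infty$, $\max(\infty, \tau) = \max(\tau, \infty) = \infty$, $\min(\infty, \tau) = \min(\tau, \infty) = \tau$, and $\min \emptyset = \infty$. For a point $\tau_0 \in TIME$, a left-closed right-open interval $[0, \tau_0)$ is defined as $\{\tau \mid \tau \in TIME \land 0 \leq \tau < \tau_0\}$.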

DEFINITION  5.2  (Model) 

Let $\tau_0 \in TIME \cup \{\infty\}$.

A model $\sigma$ (of a real-time computation) is a mapping $\sigma: [0, \tau_0) \rightarrow \wp(DCHAN)$.

DEFINITION 5.3 (Length of a model)

For a model $\sigma$ with domain $[0, \tau_0)$ the length of $\sigma$, denoted by $|\sigma|$, is defined as $|\sigma| = \tau_0$.

Thus, for all $\tau \in TIME$ with $\tau < |\sigma|$, we have $\sigma(\tau) \subseteq DCHAN$. Informally, a model $\sigma$ represents the communication behaviour at each point in time during an execution of a program. If $|\sigma| = \infty$ then $\sigma$ represents a non-terminating computation, and if $|\sigma| < \infty$ then it represents a computation that terminates at time $|\sigma|$. For a point of time $\tau$, $\tau < |\sigma|$, and a channel name $c \in CHAN$, we have three possible elements $c$, $c!$ and $c?$ for $\sigma(\tau)$ with the following meaning:

• $c \in \sigma(\tau)$ if a communication takes place along channel $c$ at time $\tau$;

• $c! \in \sigma(\tau)$ if a process is waiting to send along channel $c$ at time $\tau$;

• $c? \in \sigma(\tau)$ if a process is waiting to receive along channel $c$ at time $\tau$.

DEFINITION  5.4  (Concatenation  of  models) 

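Define the concatenation of two models $\sigma_1$ and $\sigma_2$, denoted by $\sigma_1\sigma_2$, as the model with $|\sigma_1\sigma_2| = |\sigma_1| + |\sigma_2|$ and

\[
(\sigma_1\sigma_2)(\tau) =
\begin{cases}
\sigma_1(\tau) & \text{for all } \tau < |\sigma_1| \\
\sigma_2(\tau - |\sigma_1|) & \text{for all } |\sigma_1| \leq \tau < |\sigma_1| + |\sigma_2|
\end{cases}
\]

Note that, for all models $\sigma_1$ and $\sigma_2$, if $|\sigma_1| = \infty$ then $\sigma_1\sigma_2 = \sigma_1$.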
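
Models, their length and their concatenation can be prototyped directly. The following Python sketch makes several assumptions that are not in the text: a model is taken to be piecewise constant and is stored as a list of contiguous segments `(start, end, events)`, the class name `Model` and the function name `concat` are hypothetical, rational time points are represented by `fractions.Fraction`, and `math.inf` stands for the special value for infinity.

```python
import math
from fractions import Fraction


class Model:
    """A piecewise-constant model: contiguous segments (start, end, events) from 0."""

    def __init__(self, segments):
        self.segments = [(s, e, frozenset(v)) for s, e, v in segments]
        # The length is the right end of the domain [0, length).
        self.length = self.segments[-1][1] if self.segments else Fraction(0)

    def at(self, tau):
        """Return the set of DCHAN elements recorded at time tau."""
        for start, end, events in self.segments:
            if start <= tau < end:
                return events
        raise ValueError("tau lies outside the domain [0, length)")


def concat(m1, m2):
    """Concatenation of models: m2 is shifted by the length of m1."""
    if m1.length == math.inf:
        return m1                      # nothing can follow an infinite model
    n = m1.length
    shifted = [(s + n, e + n, v) for s, e, v in m2.segments]
    return Model(m1.segments + shifted)


if __name__ == "__main__":
    m1 = Model([(0, 2, {"c!"}), (2, 3, {"c"})])   # wait to send, then communicate
    m2 = Model([(0, 4, set())])                    # four silent time units
    m = concat(m1, m2)
    print(m.length, m.at(Fraction(5, 2)), m.at(3))
```

The length of the concatenation is the sum of the lengths, and a lookup at a point beyond the first model is redirected to the second model with the offset subtracted, as in definition 5.4.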

A  compositional  semantics  for  our  programming  language  is  defined  in  Hooman 
(1991b)  using  the  computational  model  as  described  above.  The  meaning  of  a  program 
$S$, denoted by $\mathcal{M}(S)$, is a set of models representing the possible computations of $S$
starting at time 0. $\mathcal{M}(S)$ is defined by induction on the structure of $S$ according to
the  grammar  in  table  2. 

5.3  Specifications 

We  modify  the  Hoare-style  framework  of  the  previous  chapters  by  extending  the 
first-order  assertion  language  with  primitives  to  specify  the  timing  behaviour  of 
programs.  As  already  explained,  by  means  of  Hoare  triples  we  can  only  specify  partial 
correctness.  Hence,  we  add  a  third  assertion,  called  commitment,  to  specify  real-time 
properties  of  terminating  and  non-terminating  computations.  In  contrast  with  the 
previous chapters, the aim is to specify, besides safety properties (which can be falsified
in finite time), also liveness properties. Therefore the commitment will not be an
invariant  which  holds  at  any  point  during  a  computation  (as  in  §4),  but  it  should 
hold  for  complete,  possibly  infinite,  computations. 

5.3a Modification of Hoare triples to real-time: To extend a Hoare triple $\{p\}S\{q\}$ to real-time, a special variable time is introduced. Consider, for instance, the formula $\{time = 3\}\ \mathbf{delay}\ 2\ \{time = 5\}$. In the pre-condition the variable time specifies the starting time of the program, whereas in the post-condition time denotes the termination time. Furthermore, to specify the timed communication behaviour of programs, the assertion language includes a primitive comm via c at exp to express that a communication along channel c takes place at time exp. As in the semantics, primitives are required to express that a process is waiting to communicate. Here we use wait to c! at exp to denote that a process is waiting to send a message along channel c at time exp, and wait to c? at exp to denote that a process is waiting to receive along channel c at exp. As usual in Hoare-style formalisms, logical variables are used to relate pre- and post-condition. In this chapter we have logical variables ranging over $TIME \cup \{\infty\}$, and quantification over these variables. For instance, with logical variable $t$, the specification $\{time = t\}\,S\,\{t + 4 < time < t + 7\}$ expresses that if $S$ terminates then it takes between 4 and 7 time units.

Recall  that  a  formula  {p}S{q}  can  only  express  the  behaviour  of  terminating 
computations,  and  hence  such  a  specification  is  trivially  satisfied  by  non-terminating 
programs.  Therefore  we  extend  a  Hoare  triple  {p}S{q}  with  a  third  assertion,  called 
a  commitment,  which  expresses  the  real-time  communication  behaviour,  of  all 
executions  of  S,  including  the  non-terminating  ones.  This  leads  to  a  correctness  formula 
of  the  form  C:  {p}  S  {q}.  In  general,  commitment  C  reflects  the  real-time  communication 
interface  between  parallel  components,  whereas  the  pre-  and  post-condition  facilitate 
the  reasoning  for  sequential  composition  and  iteration. 

Finally, we argue that termination should be expressible in commitments. Consider the statements $S_1 \equiv c?$ and $S_2 \equiv [c? \rightarrow \mathbf{skip}\ \Box\ c? \rightarrow {*}[\mathbf{delay}\ 1 \rightarrow \mathbf{skip}]]$. Then the programs $S_1; d!$ and $S_2; d!$, which satisfy the same Hoare triples, can be distinguished in our extended framework by the commitment $\forall t_0: (\text{comm via } c \text{ at } t_0 \rightarrow \exists t_1 \geq t_0: [\text{wait to } d! \text{ at } t_1 \lor \text{comm via } d \text{ at } t_1])$. Since we aim at a compositional proof system, the distinction between $S_1; d!$ and $S_2; d!$ implies that $S_1$ and $S_2$ must also be distinguishable. Hence we have to express termination in the commitment. This can be done conveniently, without introducing new primitives, by allowing the special variable time to occur in commitments. Observe that the commitment can be seen as an extension of the post-condition to non-terminating computations. Hence, by interpreting time as in post-conditions, time in commitments expresses the termination time of terminating computations. For non-terminating computations we use the special value $\infty$, and such computations satisfy the commitment $time = \infty$. In the example above, $S_1$ and $S_2$ can be distinguished by using the commitment $\forall t_0: (\text{comm via } c \text{ at } t_0 \rightarrow time < \infty)$.

5.3b Assertion language: Let $TVAR$ be a set of logical variables ranging over $TIME \cup \{\infty\}$. The syntax of the assertion language is given in table 3, where $t \in TVAR$, $c \in CHAN$, and $\tau \in TIME \cup \{\infty\}$. Let $dch(p)$ denote the set of all $c$, $c!$, or $c?$ occurring in assertion $p$.

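Table 3. Syntax of the assertion language.

\[
\begin{array}{lrcl}
\text{Expression} & exp & ::= & \tau \mid t \mid time \mid exp_1 + exp_2 \mid exp_1 \times exp_2 \\
\text{Assertion} & p & ::= & \text{comm via } c \text{ at } exp \mid \text{wait to } c! \text{ at } exp \mid \text{wait to } c? \text{ at } exp \mid \\
 & & & exp_1 = exp_2 \mid exp_1 < exp_2 \mid exp \in \mathbb{N} \mid \\
 & & & \neg p \mid p_1 \lor p_2 \mid \exists t: p
\end{array}
\]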


To interpret logical variables we use a logical variable environment $\gamma: TVAR \rightarrow TIME \cup \{\infty\}$, i.e., a mapping which assigns a value from $TIME \cup \{\infty\}$ to each logical variable. The value of a variable $t$ in an environment $\gamma$ is denoted by $\gamma(t)$. The variant of an environment $\gamma$ with respect to a logical variable $t$ and a value $\tau \in TIME \cup \{\infty\}$, denoted by $(\gamma : t \mapsto \tau)$, is defined as follows. For any logical variable $t_1$,

\[
(\gamma : t \mapsto \tau)(t_1) =
\begin{cases}
\gamma(t_1) & \text{if } t \not\equiv t_1 \\
\tau & \text{if } t \equiv t_1
\end{cases}
\]


Then we formally define when an assertion $p$ holds in an environment $\gamma$ and a model $\sigma$ as defined in §5.2. The special variable time is interpreted as the length of $\sigma$ (i.e., $|\sigma|$). If expression exp yields a value $\tau < |\sigma|$ then the interpretation of the primitives comm via c at exp, wait to c! at exp, and wait to c? at exp is straightforward using $\sigma(\tau)$. But if exp yields a value greater than or equal to $|\sigma|$ then we must be more careful with the meaning of these primitives. Consider, for instance, comm via c at (time + 3). If such an assertion would hold in a model, which is intuitively strange, then this would lead to problems in the proof system. For instance, for a formula $C : \{\text{comm via } c \text{ at } (time + 3)\}\,S\,\{q\}$ the information from the pre-condition should not be used in $C$ and $q$. In general, a pre-condition should express the behaviour before the start of a program and it should not restrict the behaviour of the program at points of time after the starting time. Thus we aim at an interpretation in which comm via c at (time + 3) never holds in a model. Note that $C : \{\neg\,\text{comm via } c \text{ at } (time + 3)\}\,S\,\{q\}$ leads to the same problems, and hence also $\neg\,$comm via c at (time + 3) should not hold in any model.

A  possible  solution  is  to  be  careful  with  negations  and  to  apply  it  only  to  primitive 
assertions.  Here  we  choose  an  alternative  approach;  to  achieve  a  compositional 
definition  of  negation  we  use  a  three-valued  interpretation.  This  means  that  the  value 
of an assertion $p$ in an environment $\gamma$ and a model $\sigma$, denoted $[\![p]\!]\gamma\sigma$, is true, false, or
$\bot$. To define negation and disjunction of assertions, logical operators NOT3 and OR3
for these three values are defined by the truth tables in table 4. These operators, which
were  introduced  in  Kleene  (1952),  are  the  strongest  monotonic  extensions  of  the 
classical  (two-valued)  operators. 


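Table 4. Three-valued negation and disjunction.

\[
\begin{array}{c|c}
p & \mathrm{NOT3}\ p \\ \hline
true & false \\
false & true \\
\bot & \bot
\end{array}
\qquad\qquad
\begin{array}{c|ccc}
\mathrm{OR3} & true & false & \bot \\ \hline
true & true & true & true \\
false & true & false & \bot \\
\bot & true & \bot & \bot
\end{array}
\]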
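
The truth tables of table 4 translate directly into code. In the Python sketch below, which is an illustration and not part of the formalism, the undefined value is represented by `None` (a choice made here), and the names `not3`, `or3` and `and3` are hypothetical; conjunction is obtained from negation and disjunction by the usual De Morgan abbreviation.

```python
BOTTOM = None   # stands for the third truth value (undefined)


def not3(p):
    """Three-valued negation: undefined stays undefined."""
    return BOTTOM if p is BOTTOM else (not p)


def or3(p, q):
    """Three-valued disjunction: true dominates, even over undefined."""
    if p is True or q is True:
        return True
    if p is False and q is False:
        return False
    return BOTTOM


def and3(p, q):
    """Conjunction as the abbreviation NOT3(OR3(NOT3 p, NOT3 q))."""
    return not3(or3(not3(p), not3(q)))


if __name__ == "__main__":
    for p in (True, False, BOTTOM):
        print(p, not3(p), [or3(p, q) for q in (True, False, BOTTOM)])
```

Note that `or3(True, BOTTOM)` is true while `or3(False, BOTTOM)` is undefined, which is exactly what makes these operators monotonic extensions of the classical ones.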

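First we define the value of expression exp in a model $\sigma$ and an environment $\gamma$, denoted $v(exp)(\gamma, \sigma)$, yielding a value from $TIME \cup \{\infty\}$.

• $v(\tau)(\gamma, \sigma) = \tau$,

• $v(t)(\gamma, \sigma) = \gamma(t)$,

• $v(time)(\gamma, \sigma) = |\sigma|$,

• $v(exp_1 + exp_2)(\gamma, \sigma) = v(exp_1)(\gamma, \sigma) + v(exp_2)(\gamma, \sigma)$,

• $v(exp_1 \times exp_2)(\gamma, \sigma) = v(exp_1)(\gamma, \sigma) \times v(exp_2)(\gamma, \sigma)$.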

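Next we define inductively $[\![p]\!]\gamma\sigma$ as an element of $\{true, false, \bot\}$.

•
\[
[\![\text{comm via } c \text{ at } exp]\!]\gamma\sigma =
\begin{cases}
true & \text{if } v(exp)(\gamma,\sigma) < |\sigma| \text{ and } c \in \sigma(v(exp)(\gamma,\sigma)) \\
false & \text{if } v(exp)(\gamma,\sigma) < |\sigma| \text{ and } c \notin \sigma(v(exp)(\gamma,\sigma)) \\
\bot & \text{if } v(exp)(\gamma,\sigma) \geq |\sigma|
\end{cases}
\]

•
\[
[\![\text{wait to } c! \text{ at } exp]\!]\gamma\sigma =
\begin{cases}
true & \text{if } v(exp)(\gamma,\sigma) < |\sigma| \text{ and } c! \in \sigma(v(exp)(\gamma,\sigma)) \\
false & \text{if } v(exp)(\gamma,\sigma) < |\sigma| \text{ and } c! \notin \sigma(v(exp)(\gamma,\sigma)) \\
\bot & \text{if } v(exp)(\gamma,\sigma) \geq |\sigma|
\end{cases}
\]

•
\[
[\![\text{wait to } c? \text{ at } exp]\!]\gamma\sigma =
\begin{cases}
true & \text{if } v(exp)(\gamma,\sigma) < |\sigma| \text{ and } c? \in \sigma(v(exp)(\gamma,\sigma)) \\
false & \text{if } v(exp)(\gamma,\sigma) < |\sigma| \text{ and } c? \notin \sigma(v(exp)(\gamma,\sigma)) \\
\bot & \text{if } v(exp)(\gamma,\sigma) \geq |\sigma|
\end{cases}
\]

•
\[
[\![exp_1 = exp_2]\!]\gamma\sigma =
\begin{cases}
true & \text{if } v(exp_1)(\gamma,\sigma) = v(exp_2)(\gamma,\sigma) \\
false & \text{if } v(exp_1)(\gamma,\sigma) \neq v(exp_2)(\gamma,\sigma)
\end{cases}
\]

•
\[
[\![exp_1 < exp_2]\!]\gamma\sigma =
\begin{cases}
true & \text{if } v(exp_1)(\gamma,\sigma) < v(exp_2)(\gamma,\sigma) \\
false & \text{if } v(exp_1)(\gamma,\sigma) \geq v(exp_2)(\gamma,\sigma)
\end{cases}
\]

•
\[
[\![exp \in \mathbb{N}]\!]\gamma\sigma =
\begin{cases}
true & \text{if } v(exp)(\gamma,\sigma) \in \mathbb{N} \\
false & \text{if } v(exp)(\gamma,\sigma) \notin \mathbb{N}
\end{cases}
\]

• $[\![\neg p]\!]\gamma\sigma = \mathrm{NOT3}\ [\![p]\!]\gamma\sigma$,

• $[\![p_1 \lor p_2]\!]\gamma\sigma = [\![p_1]\!]\gamma\sigma\ \mathrm{OR3}\ [\![p_2]\!]\gamma\sigma$,

•
\[
[\![\exists t: p]\!]\gamma\sigma =
\begin{cases}
true & \text{if there exists a } \tau \in TIME \cup \{\infty\} \text{ with } [\![p]\!](\gamma : t \mapsto \tau)\sigma = true \\
false & \text{if for all } \tau \in TIME \cup \{\infty\},\ [\![p]\!](\gamma : t \mapsto \tau)\sigma = false \\
\bot & \text{otherwise.}
\end{cases}
\]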
The conventional abbreviations are used, such as $p_1 \land p_2 \equiv \neg(\neg p_1 \lor \neg p_2)$, $p_1 \rightarrow p_2 \equiv \neg p_1 \lor p_2$, and $\forall t: p \equiv \neg \exists t: \neg p$. Subsequently we say that $p$ holds in $\gamma$ and $\sigma$ if $[\![p]\!]\gamma\sigma = true$. Furthermore, we frequently use $[\![p]\!]\gamma\sigma$ as an abbreviation of $[\![p]\!]\gamma\sigma = true$.

Returning to our example, observe that the interpretation is such that, for any $\gamma$ and $\sigma$, $[\![\text{comm via } c \text{ at } (time + 3)]\!]\gamma\sigma = \bot$ and $[\![\neg\,\text{comm via } c \text{ at } (time + 3)]\!]\gamma\sigma = \bot$. Thus neither comm via c at (time + 3) nor $\neg\,$comm via c at (time + 3) holds in $\gamma$ and $\sigma$. In general this interpretation facilitates sequential reasoning, since $[\![p]\!]\gamma\sigma$ implies that $p$ does not express any constraint on points of time after $|\sigma|$. This is expressed formally in lemma 5.5 below: if assertion $p$ holds in $\sigma_1$ then $p$, with time replaced by $|\sigma_1|$, holds in any arbitrary extension of $\sigma_1$, i.e., in $\sigma_1\sigma_2$ for any $\sigma_2$. (A proof for this lemma can be found in Hooman 1991b.)

Lemma 5.5. For all $\gamma$ and $\sigma_1$: $[\![p]\!]\gamma\sigma_1$ iff (for all $\sigma_2$, $[\![p[|\sigma_1|/time]]\!]\gamma\sigma_1\sigma_2$).
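
The three-valued reading of the primitive comm via c at exp can be tried out on a concrete model. The Python sketch below is a hedged illustration: the function names `comm_via` and `not3` are hypothetical, the undefined value is represented by `None`, the expression is assumed to be already evaluated to a number, and the model is given as a function from time points to sets together with its length.

```python
from fractions import Fraction

BOTTOM = None   # the undefined truth value


def not3(p):
    """Three-valued negation."""
    return BOTTOM if p is BOTTOM else (not p)


def comm_via(c, exp_value, sigma, length):
    """Value of 'comm via c at exp' where exp evaluates to exp_value.

    sigma maps a time point below length to the set of events at that
    point; at or beyond length the assertion is undefined.
    """
    if exp_value >= length:
        return BOTTOM
    return c in sigma(exp_value)


def example_model(tau):
    """A model of length 4 in which c communicates during [1, 2)."""
    return {"c"} if 1 <= tau < 2 else set()


if __name__ == "__main__":
    time = 4                                     # the length of the model
    print(comm_via("c", Fraction(3, 2), example_model, time))
    print(comm_via("c", 3, example_model, time))
    print(comm_via("c", time + 3, example_model, time))
    print(not3(comm_via("c", time + 3, example_model, time)))
```

The last two results are both undefined, so neither the assertion at time + 3 nor its negation holds in the model, in line with the discussion above.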

We  use  the  conventional  relations  between  expressions,  such  as 

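• $exp_1 \leq exp_2 \equiv (exp_1 < exp_2) \lor (exp_1 = exp_2)$,

• $exp_1 \geq exp_2 \equiv (exp_2 < exp_1) \lor (exp_1 = exp_2)$,

• $exp_1 \leq exp_2 \leq exp_3 \equiv (exp_1 \leq exp_2) \land (exp_2 \leq exp_3)$, etc.

Relativized quantifiers are defined as usual, for instance,

• $\forall t,\ t_0 \leq t < time: p \ \equiv\ \forall t: t_0 \leq t < time \rightarrow p$,

• $\exists t,\ t_0 \leq t < time: p \ \equiv\ \exists t: t_0 \leq t < time \land p$.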

Furthermore,  the  following  abbreviations  are  frequently  used: 

• $true \equiv 0 = 0$,

• $false \equiv \neg\, true$,

• wait to c! during $[t_0, t_1)$ $\equiv$ $\forall t_2,\ t_0 \leq t_2 < t_1$: wait to c! at $t_2$,

• comm via c during $[t_0, t_1)$ $\equiv$ $\forall t_2,\ t_0 \leq t_2 < t_1$: comm via c at $t_2$,

• no comm via c during $[t_0, t_1)$ $\equiv$ $\forall t_2,\ t_0 \leq t_2 < t_1$: $\neg\,$comm via c at $t_2$,

• wait to c! at $t_0$ until comm at $t_1$ $\equiv$ wait to c! during $[t_0, t_1)$ $\land$ comm via c during $[t_1, t_1 + K_c)$,

• wait to c! at $t_0$ until comm $\equiv$ $\exists t_1 \geq t_0$: wait to c! at $t_0$ until comm at $t_1$.

Let cset be a finite subset of $DCHAN$. Then

• no cset during $[t_0, t_1)$ $\equiv$ $\forall t,\ t_0 \leq t < t_1$: $\bigwedge_{c! \in cset} \neg\,$wait to c! at $t$ $\land$ $\bigwedge_{c? \in cset} \neg\,$wait to c? at $t$ $\land$ $\bigwedge_{c \in cset} \neg\,$comm via c at $t$.

The abbreviations above are also used with c? instead of c!, and with other intervals such as $(t_0, t_1)$ and $(t_0, \infty)$ instead of the interval $[t_0, t_1)$. It is easy to extend these definitions for general expressions instead of $t_0$ or $t_1$.

Observe that logical variables range over $TIME \cup \{\infty\}$, and thus wait to c! at $t_0$ until comm, i.e., $\exists t_1 \geq t_0$: wait to c! at $t_0$ until comm at $t_1$, is equivalent to

[wait to c! at $t_0$ until comm at $\infty$] $\lor$ [$\exists t_1,\ t_0 \leq t_1 < \infty$: wait to c! at $t_0$ until comm at $t_1$].

Since comm via c during $[\infty, \infty + K_c) \leftrightarrow true$, this is equivalent to

[wait to c! during $[t_0, \infty)$] $\lor$ [$\exists t_1,\ t_0 \leq t_1 < \infty$: wait to c! during $[t_0, t_1)$ $\land$ comm via c during $[t_1, t_1 + K_c)$].
DEFINITION 5.6. (Validity of assertions)

An assertion $p$ is valid, denoted $\models p$, iff $p$ holds in any environment $\gamma$ and any model $\sigma$, i.e., $[\![p]\!]\gamma\sigma$ for all $\gamma$ and $\sigma$.

Next we define when a correctness formula $C : \{p\}S\{q\}$ is valid. Informally, if $p$ holds in an initial model $\hat{\sigma}$, and $\sigma$ represents a computation of $S$, then $C$ holds in the concatenation $\hat{\sigma}\sigma$ of these models, and if $\sigma$ terminates then $q$ holds in $\hat{\sigma}\sigma$. This leads to the following formal definition (recall that, e.g., $[\![p]\!]\gamma\hat{\sigma}$ is an abbreviation of $[\![p]\!]\gamma\hat{\sigma} = true$).

DEFINITION  5.7.  (Validity  of  a  correctness  formula) 

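For a program $S$ and assertions $C$, $p$ and $q$, a correctness formula $C : \{p\}S\{q\}$ is valid, denoted by $\models C : \{p\}S\{q\}$, iff for any $\gamma$ and any well-formed model $\hat{\sigma}$ with $|\hat{\sigma}| < \infty$:

if $[\![p]\!]\gamma\hat{\sigma}$ then for all $\sigma \in \mathcal{M}(S)$: $[\![C]\!]\gamma(\hat{\sigma}\sigma)$, and if $|\sigma| < \infty$ then $[\![q]\!]\gamma(\hat{\sigma}\sigma)$.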

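Example 5.2. We show that $\models time = t + d : \{time = t\}\ \mathbf{delay}\ d\ \{time = t + d\}$. Consider an environment $\gamma$ and a model $\hat{\sigma}$ with $|\hat{\sigma}| < \infty$. Assume $[\![time = t]\!]\gamma\hat{\sigma}$. Then $|\hat{\sigma}| = \gamma(t)$, i.e., the starting time is the value of $t$ in environment $\gamma$. For $\sigma \in \mathcal{M}(\mathbf{delay}\ d)$ we have $|\sigma| = d$. Then $[\![time = t + d]\!]\gamma\hat{\sigma}\sigma$, since $|\hat{\sigma}\sigma| = |\hat{\sigma}| + |\sigma| = \gamma(t) + d$. □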

Observe  that  the  definition  of  validity  of  a  correctness  formula  requires  that  the 
assertions  hold  for  each  environment  y,  and  hence  free  logical  variables  in  a 
specification  are  implicitly  universally  quantified. 

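Example 5.3. As an example of a liveness specification, consider the formula

\[
C \land time = \infty : \{time = 0\}\ {*}[c? \rightarrow \mathbf{skip}]\ \{false\},
\]

with

\[
C \equiv (\forall t_0 < \infty\ \exists t_1 > t_0: \text{comm via } c \text{ at } t_1) \lor (\exists t_0 < \infty: \text{wait to } c? \text{ during } [t_0, \infty)).
\]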

This  commitment  expresses  that  the  program  either  communicates  infinitely  often,  or 
it  eventually  waits  forever.  Observe  that  this  is  not  a  safety  property  since  it  cannot 
be  falsified  in  finite  time.  After  presenting  the  rule  for  the  iteration  construct  we  show 
that  this  valid  formula  is  also  derivable,  as  it  should  be  in  a  complete  proof  system. 

□ 


5.4  Proof  system 

In  this  section  we  give  a  compositional  proof  system  for  our  correctness  formulae. 
First  we  formulate  rules  and  axioms  that  are  generally  applicable  to  any  statement. 
Next we axiomatize the programming language by formulating rules and axioms for
all atomic statements and compound programming constructs. Let $\vdash C : \{p\}\,S\,\{q\}$
denote that the formula $C : \{p\}\,S\,\{q\}$ can be derived in this proof system.

General  part 

We  start  with  an  axiom  expressing  the  well-formedness  properties  of  a  computation. 
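Let cset be a finite subset of DCHAN.

Axiom 5.8. (Well-formedness)

$$\mathit{WellForm}_{cset} : \{\mathit{true}\}\ S\ \{\mathit{WellForm}_{cset}\}$$

where

$$\mathit{WellForm}_{cset} \equiv \forall t < \mathit{time} : MW_{cset}(t) \wedge \mathit{Excl}_{cset}(t),$$

with

$$MW_{cset}(t) \equiv \bigwedge_{\{c!,c?\} \subseteq cset} \neg(\text{wait to } c! \text{ at } t \wedge \text{wait to } c? \text{ at } t),$$

$$\mathit{Excl}_{cset}(t) \equiv \Big(\bigwedge_{\{c,c!\} \subseteq cset} \neg(\text{wait to } c! \text{ at } t \wedge \text{comm via } c \text{ at } t)\Big) \wedge \Big(\bigwedge_{\{c,c?\} \subseteq cset} \neg(\text{wait to } c? \text{ at } t \wedge \text{comm via } c \text{ at } t)\Big).$$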

We give a few lemmas concerning the use of $\mathit{WellForm}_{cset}$. These lemmas will be
used in §6, where we illustrate our formalism by an example of a watchdog
timer.


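Lemma 5.9. For all $t_0$, $t_1$,

$$\text{wait to } c? \text{ during } (t_0, t_1) \wedge \mathit{WellForm}_{\{c, c!, c?\}} \to \text{no } \{c!, c\} \text{ during } (t_0, t_1).$$


Lemma 5.10. For all $t_0$,

$$\text{wait to } c! \text{ at } t_0 \text{ until comm} \wedge \mathit{WellForm}_{\{c, c!, c?\}} \to \forall t_1,\ t_0 < t_1 < t_0 + K_c : \neg\,\text{wait to } c? \text{ at } t_1.$$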
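To make the well-formedness conditions concrete, here is a minimal sketch that checks them on a finite trace. It assumes discrete time steps (the formalism itself does not), a hypothetical encoding in which each step is a set of event strings, and a cset that contains c, c! and c? for every listed channel.

```python
def well_formed(trace, channels):
    """trace[t] is the set of events at time step t. For a channel 'c' the
    events are 'c!' (wait to send), 'c?' (wait to receive), 'c' (communicate)."""
    for events in trace:
        for c in channels:
            send, recv, comm = c + "!" in events, c + "?" in events, c in events
            if send and recv:             # violates MW: both sides keep waiting
                return False
            if comm and (send or recv):   # violates Excl: waiting while communicating
                return False
    return True

def lemma_5_9_instance(trace, c):
    """On a well-formed trace where c? is awaited at every step,
    neither c! nor a communication via c occurs."""
    if not well_formed(trace, [c]) or not all(c + "?" in ev for ev in trace):
        return True                       # premise fails, implication holds vacuously
    return all(c + "!" not in ev and c not in ev for ev in trace)

ok_trace = [{"c?"}, {"c?"}, {"c"}, set()]
bad_trace = [{"c?", "c!"}, {"c"}]
```

Here `ok_trace` is well-formed, whereas `bad_trace` has both partners waiting at the same step and is rejected.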

The proof system contains a consequence rule which is an extension of the classical
consequence rule for Hoare triples (see rule 3.3 in §3). Note that by definition 5.7 of
a valid correctness formula, pre-conditions are interpreted in a model $\hat\sigma$ with $|\hat\sigma| < \infty$.
Hence any pre-condition can be strengthened by adding $\mathit{time} < \infty$ to express that the
starting time is finite.


Rule 5.11. (Consequence)

$$\dfrac{C_0 : \{p_0\}\,S\,\{q_0\}, \quad p \wedge \mathit{time} < \infty \to p_0, \quad C_0 \to C, \quad q_0 \to q}{C : \{p\}\,S\,\{q\}}$$

Observe that $\vdash \mathit{false} : \{\mathit{time} = \infty\}\,S\,\{\mathit{false}\}$, for any program $S$. To deduce this formula,
we first derive $\mathit{false} : \{\mathit{false}\}\,S\,\{\mathit{false}\}$. This can be done by means of the initial invariance
axiom below. Then we can use the consequence rule, since $\mathit{time} = \infty \wedge \mathit{time} < \infty \to \mathit{false}$.

Next  we  give  two  axioms  to  deduce  invariance  properties.  The  first  axiom  expresses 
that  the  pre-condition,  except  for  the  variable  time,  remains  valid  during  the  execution 
of  a  program. 


Axiom 5.12. (Initial invariance)

$$p : \{p\}\,S\,\{p\}$$

provided time does not occur in $p$.

The soundness of this axiom is based on lemma 5.5 which guarantees that if $[\![p]\!]\gamma\hat\sigma$
and $p$ does not contain time, then $[\![p]\!]\gamma(\hat\sigma\sigma)$, for any model $\sigma$.

The  channel  invariance  axiom  below  expresses  that  during  the  execution  of  a 
program  S  no  activity  takes  place  on  channels  not  occurring  in  S.  Let  cset  be  a  finite 
subset  of  DCHAN. 

Axiom 5.13. (Channel invariance)

$$\text{no } cset \text{ during } [t_0, \mathit{time}) : \{\mathit{time} = t_0\}\ S\ \{\text{no } cset \text{ during } [t_0, \mathit{time})\}$$

provided $cset \cap dch(S) = \emptyset$.

Our  proof  system  contains  the  following  rules  for  conjunction,  quantification,  and 
substitution. 

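Rule 5.14. (Conjunction)

$$\dfrac{C_1 : \{p_1\}\,S\,\{q_1\}, \quad C_2 : \{p_2\}\,S\,\{q_2\}}{C_1 \wedge C_2 : \{p_1 \wedge p_2\}\,S\,\{q_1 \wedge q_2\}}$$

Rule 5.15. (Quantification)

$$\dfrac{C : \{p\}\,S\,\{q\}}{C : \{\exists t : p\}\,S\,\{q\}}$$

provided $t$ does not occur in $C$ and $q$.

Rule 5.16. (Substitution)

$$\dfrac{C : \{p\}\,S\,\{q\}}{C[exp/t] : \{p[exp/t]\}\,S\,\{q[exp/t]\}}$$

provided time does not occur in expression $exp$.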

The following rule can be used to transform a correctness formula with pre-condition
$\mathit{time} = t_0$ into a formula with an arbitrary pre-condition and starting time. This is a
derived rule, that is, the rule can be derived from the other rules and axioms in the proof
system (as we prove below).

Derived rule 5.17. (Adaptation)

$$\dfrac{C : \{\mathit{time} = t_0\}\,S\,\{q\}}{p[exp/\mathit{time}] \wedge C[exp/t_0] : \{p[exp/\mathit{time}] \wedge \mathit{time} = exp\}\ S\ \{p[exp/\mathit{time}] \wedge q[exp/t_0]\}}$$

provided time does not occur in expression $exp$.

Proof. Assume $\vdash C : \{\mathit{time} = t_0\}\,S\,\{q\}$, and suppose time does not occur in expression
$exp$. By the substitution rule, replacing $t_0$ by $exp$, we obtain

$$\vdash C[exp/t_0] : \{\mathit{time} = exp\}\,S\,\{q[exp/t_0]\}.$$

Since time does not occur in $exp$, it does not occur in $p[exp/\mathit{time}]$ either, so the initial
invariance axiom leads to

$$\vdash p[exp/\mathit{time}] : \{p[exp/\mathit{time}]\}\,S\,\{p[exp/\mathit{time}]\}.$$

Then, by the conjunction rule,

$$\vdash p[exp/\mathit{time}] \wedge C[exp/t_0] : \{p[exp/\mathit{time}] \wedge \mathit{time} = exp\}\ S\ \{p[exp/\mathit{time}] \wedge q[exp/t_0]\}. \quad \Box$$

Program  part 

The  rules  and  axioms  for  atomic  statements  will  be  given  with  pre-condition  time  =  t0; 
with  the  adaptation  rule  one  can  easily  obtain  any  arbitrary  pre-condition. 

Axiom 5.18. (Skip)

$$\mathit{time} = t_0 : \{\mathit{time} = t_0\}\ \text{skip}\ \{\mathit{time} = t_0\}$$

Axiom 5.19. (Delay)

$$\mathit{time} = t_0 + d : \{\mathit{time} = t_0\}\ \text{delay } d\ \{\mathit{time} = t_0 + d\}$$

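Example 5.4. We can derive

$$(\mathit{time} = 5 \wedge \text{comm via } c \text{ at } 1) : \{\mathit{time} = 2 \wedge \text{comm via } c \text{ at } 1\}\ \text{delay } 3\ \{\mathit{time} = 5\}$$

as follows. By the delay axiom,

$$\mathit{time} = t_0 + 3 : \{\mathit{time} = t_0\}\ \text{delay } 3\ \{\mathit{time} = t_0 + 3\}.$$

Using the adaptation rule, with $p \equiv \text{comm via } c \text{ at } 1$ and $exp = 2$, we obtain

$$(\mathit{time} = 2 + 3 \wedge \text{comm via } c \text{ at } 1) : \{\mathit{time} = 2 \wedge \text{comm via } c \text{ at } 1\}\ \text{delay } 3\ \{\mathit{time} = 2 + 3 \wedge \text{comm via } c \text{ at } 1\}.$$

Finally, the consequence rule leads to

$$(\mathit{time} = 5 \wedge \text{comm via } c \text{ at } 1) : \{\mathit{time} = 2 \wedge \text{comm via } c \text{ at } 1\}\ \text{delay } 3\ \{\mathit{time} = 5\}.$$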

□ 

To formulate a rule for a send statement $c!$, observe that the post-condition can
characterize terminating computations consisting of a waiting period (during which
no communication partner is available) followed by an interval during which the actual
communication takes place. In addition, the commitment can characterize
non-terminating computations in which the i/o-statement waits forever to communicate.
Observe that $\exists t \ge t_0 : \text{wait to } c! \text{ at } t_0 \text{ until comm at } t \wedge \mathit{time} = t + K_c$ implies that either
$t = \infty$ and $\text{wait to } c! \text{ during } [t_0, \infty) \wedge \mathit{time} = \infty$, or there exists a $t$, $t_0 \le t < \infty$, such
that $\text{wait to } c! \text{ during } [t_0, t) \wedge \text{comm via } c \text{ during } [t, t + K_c) \wedge \mathit{time} = t + K_c$. This leads
to the following rule:

Rule 5.20. (Send)

$$\dfrac{(\exists t \ge t_0 : \text{wait to } c! \text{ at } t_0 \text{ until comm at } t \wedge \mathit{time} = t + K_c) \to C}{C : \{\mathit{time} = t_0\}\ c!\ \{C \wedge \mathit{time} < \infty\}}$$

Similar to the send rule, we have the following rule for a receive statement.


Rule 5.21. (Receive)

$$\dfrac{(\exists t \ge t_0 : \text{wait to } c? \text{ at } t_0 \text{ until comm at } t \wedge \mathit{time} = t + K_c) \to C}{C : \{\mathit{time} = t_0\}\ c?\ \{C \wedge \mathit{time} < \infty\}}$$

The inference rule for sequential composition is an extension of the classical rule for
Hoare triples. To explain the commitment of $S_1; S_2$, observe that a computation of
$S_1; S_2$ is either a non-terminating computation of $S_1$ or a terminated computation
of $S_1$ extended with a computation of $S_2$. The commitment of $S_1; S_2$ expresses the
non-terminating computations of $S_1$ by using the commitment of $S_1$ with $\mathit{time} = \infty$.
Terminating computations of $S_1$ are characterized in the post-condition of $S_1$, which
is also the pre-condition of $S_2$. Then these computations are extended by $S_2$ and
described in the commitment of $S_2$.

Rule 5.22. (Sequential composition)

$$\dfrac{C_1 : \{p\}\,S_1\,\{r\}, \quad C_2 : \{r\}\,S_2\,\{q\}}{(C_1 \wedge \mathit{time} = \infty) \vee C_2 : \{p\}\ S_1; S_2\ \{q\}}$$

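Example 5.5. Consider the program $c?; d!$. Define

$$C^1_{nonterm} \equiv \text{wait to } c? \text{ during } [0, \infty), \quad\text{and}\quad C^1_{term} \equiv \text{wait to } c? \text{ at } 0 \text{ until comm at } t_1.$$

Then

$$(C^1_{nonterm} \wedge \mathit{time} = \infty) \vee (\exists t_1 < \infty : C^1_{term} \wedge \mathit{time} = t_1 + K_c) : \{\mathit{time} = 0\}\ c?\ \{\exists t_1 < \infty : C^1_{term} \wedge \mathit{time} = t_1 + K_c\}.$$

For $d!$, define $C_2 \equiv \text{wait to } d! \text{ at } t_1 + K_c \text{ until comm}$. Then we can derive

$$(\exists t_1 < \infty : C^1_{term} \wedge C_2) : \{\exists t_1 < \infty : C^1_{term} \wedge \mathit{time} = t_1 + K_c\}\ d!\ \{\mathit{true}\}.$$

Observe that the terminating behaviour of $c?$ is characterized by its post-condition,
thus by the pre-condition of $d!$, and hence can be included in the commitment of $d!$.
Now the sequential composition rule leads to

$$(C^1_{nonterm} \wedge \mathit{time} = \infty) \vee (\exists t_1 < \infty : C^1_{term} \wedge C_2) : \{\mathit{time} = 0\}\ c?; d!\ \{\mathit{true}\}. \quad \Box$$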

Given  the  rules  for  the  basic  statements,  it  is  often  easier  to  use  the  following  derived 
rule: 


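Derived rule 5.23. (Sequential composition adaptation)

$$\dfrac{C_1 : \{p\}\,S_1\,\{r\}, \quad C_2 : \{\mathit{time} = t\}\,S_2\,\{q\}}{(C_1 \wedge \mathit{time} = \infty) \vee (\exists t : r[t/\mathit{time}] \wedge C_2) : \{p\}\ S_1; S_2\ \{\exists t : r[t/\mathit{time}] \wedge q\}}$$

Proof. Assume

$$\vdash C_1 : \{p\}\,S_1\,\{r\}, \qquad (1)$$

$$\vdash C_2 : \{\mathit{time} = t\}\,S_2\,\{q\}. \qquad (2)$$

Since $r \to \exists t : r[t/\mathit{time}] \wedge \mathit{time} = t$, (1) leads by the consequence rule to

$$\vdash C_1 : \{p\}\,S_1\,\{\exists t : r[t/\mathit{time}] \wedge \mathit{time} = t\}. \qquad (3)$$

By (2) and the adaptation rule:

$$\vdash r[t/\mathit{time}] \wedge C_2 : \{r[t/\mathit{time}] \wedge \mathit{time} = t\}\ S_2\ \{r[t/\mathit{time}] \wedge q\}.$$

The consequence rule leads to

$$\vdash (\exists t : r[t/\mathit{time}] \wedge C_2) : \{r[t/\mathit{time}] \wedge \mathit{time} = t\}\ S_2\ \{\exists t : r[t/\mathit{time}] \wedge q\}.$$

Then, using the quantification rule, we obtain

$$\vdash (\exists t : r[t/\mathit{time}] \wedge C_2) : \{\exists t : r[t/\mathit{time}] \wedge \mathit{time} = t\}\ S_2\ \{\exists t : r[t/\mathit{time}] \wedge q\}. \qquad (4)$$

From (3) and (4), by the sequential composition rule,

$$\vdash (C_1 \wedge \mathit{time} = \infty) \vee (\exists t : r[t/\mathit{time}] \wedge C_2) : \{p\}\ S_1; S_2\ \{\exists t : r[t/\mathit{time}] \wedge q\}. \quad \Box$$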

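Example 5.6. Consider the program $c?; d!$. We prove

$$(\exists t_1 \ge t_0 : \text{wait to } c? \text{ at } t_0 \text{ until comm at } t_1 \wedge \text{wait to } d! \text{ at } t_1 + K_c \text{ until comm}) : \{\mathit{time} = t_0\}\ c?; d!\ \{\mathit{true}\}.$$

Let $C_1 \equiv \exists t_1 \ge t_0 : \text{wait to } c? \text{ at } t_0 \text{ until comm at } t_1 \wedge \mathit{time} = t_1 + K_c$, then by the receive
rule we can derive

$$C_1 : \{\mathit{time} = t_0\}\ c?\ \{C_1 \wedge \mathit{time} < \infty\}.$$

Let $C_2 \equiv \text{wait to } d! \text{ at } t \text{ until comm}$, then from the send rule we obtain

$$C_2 : \{\mathit{time} = t\}\ d!\ \{\mathit{true}\}.$$

By the derived sequential composition rule we can now derive

$$(C_1 \wedge \mathit{time} = \infty) \vee (\exists t : (C_1 \wedge \mathit{time} < \infty)[t/\mathit{time}] \wedge C_2) : \{\mathit{time} = t_0\}\ c?; d!\ \{\exists t : (C_1 \wedge \mathit{time} < \infty)[t/\mathit{time}] \wedge \mathit{true}\}.$$

Observe that the commitment $(C_1 \wedge \mathit{time} = \infty) \vee (\exists t : (C_1 \wedge \mathit{time} < \infty)[t/\mathit{time}] \wedge C_2)$
implies

$$[\text{wait to } c? \text{ at } t_0 \text{ until comm at } \infty \wedge \mathit{time} = \infty] \vee [\exists t\, \exists t_1 \ge t_0 : \text{wait to } c? \text{ at } t_0 \text{ until comm at } t_1 \wedge t = t_1 + K_c \wedge t < \infty \wedge \text{wait to } d! \text{ at } t \text{ until comm}],$$

and thus

$$[\text{wait to } c? \text{ at } t_0 \text{ until comm at } \infty \wedge \mathit{time} = \infty] \vee [\exists t_1,\ t_0 \le t_1 < \infty : \text{wait to } c? \text{ at } t_0 \text{ until comm at } t_1 \wedge \text{wait to } d! \text{ at } t_1 + K_c \text{ until comm}].$$

Since $\vdash \text{wait to } d! \text{ at } \infty \text{ until comm}$, the consequence rule leads to

$$(\exists t_1 \ge t_0 : \text{wait to } c? \text{ at } t_0 \text{ until comm at } t_1 \wedge \text{wait to } d! \text{ at } t_1 + K_c \text{ until comm}) : \{\mathit{time} = t_0\}\ c?; d!\ \{\mathit{true}\}. \quad \Box$$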

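Consider a guarded command $G \equiv [\Box_{i=1}^{n}\, c_i? \to S_i\ \Box\ \text{delay } d \to S]$. Define

• $\text{wait in } G \text{ during } [t_0, t) \equiv \bigwedge_{i=1}^{n} \text{wait to } c_i? \text{ during } [t_0, t) \wedge \text{no } (dch(G) - \{c_1?, \ldots, c_n?\}) \text{ during } [t_0, t)$

• $\text{comm } c_i \text{ in } G \text{ from } t \equiv \text{comm via } c_i \text{ during } [t, t + K_c) \wedge \text{no } (dch(G) - \{c_i\}) \text{ during } [t, t + K_c) \wedge \mathit{time} = t + K_c$

First we give a rule for the case that $d = \infty$, thus for $G \equiv [\Box_{i=1}^{n}\, c_i? \to S_i]$. This statement
either waits forever to perform one of the $c_i?$ communications because none of the
partners is available, or it eventually communicates via one of the $c_i$ and then executes
the corresponding statement $S_i$.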

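Rule 5.24. (Guarded command without delay)

$$\dfrac{\begin{array}{l}
\text{wait in } G \text{ during } [t_0, \infty) \wedge \mathit{time} = \infty \to C_{nonterm} \\
(\exists t,\ t_0 \le t < \infty : \text{wait in } G \text{ during } [t_0, t) \wedge \text{comm } c_i \text{ in } G \text{ from } t) \to p_i, \text{ for all } i = 1, \ldots, n \\
C_i : \{p_i\}\,S_i\,\{q_i\}, \text{ for all } i = 1, \ldots, n
\end{array}}{C_{nonterm} \vee \bigvee_{i=1}^{n} C_i : \{\mathit{time} = t_0\}\ [\Box_{i=1}^{n}\, c_i? \to S_i]\ \{\bigvee_{i=1}^{n} q_i\}}$$

Next consider $G \equiv [\Box_{i=1}^{n}\, c_i? \to S_i\ \Box\ \text{delay } d \to S]$ with $d \ne \infty$.

Rule 5.25. (Guarded command with delay)

$$\dfrac{\begin{array}{l}
(\exists t,\ t_0 \le t < t_0 + d : \text{wait in } G \text{ during } [t_0, t) \wedge \text{comm } c_i \text{ in } G \text{ from } t) \to p_i, \text{ for all } i = 1, \ldots, n \\
C_i : \{p_i\}\,S_i\,\{q_i\}, \text{ for all } i = 1, \ldots, n \\
C : \{\text{wait in } G \text{ during } [t_0, t_0 + d) \wedge \mathit{time} = t_0 + d\}\ S\ \{q\}
\end{array}}{\bigvee_{i=1}^{n} C_i \vee C : \{\mathit{time} = t_0\}\ [\Box_{i=1}^{n}\, c_i? \to S_i\ \Box\ \text{delay } d \to S]\ \{\bigvee_{i=1}^{n} q_i \vee q\}}$$

provided $d \ne \infty$.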

The rule for the iteration construct does not contain any explicit well-foundedness
argument, although we deal with liveness properties. The main principle is that liveness
properties can be derived from real-time safety properties, and these properties can
be proved by means of an invariant (see example 5.7).

Rule 5.26. (Iteration)

$$\dfrac{\begin{array}{l}
C : \{C\}\ G\ \{C\} \\
(\forall t_1 < \infty\ \exists t_2 > t_1 : C[t_2/\mathit{time}]) \to C_{nonterm}
\end{array}}{C_{nonterm} \wedge \mathit{time} = \infty : \{C\}\ {*}G\ \{\mathit{false}\}}$$

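where $t_1$ and $t_2$ are fresh logical variables.

Example 5.7. Consider the formula from example 5.3, expressing a liveness property
for program $*[c? \to \text{skip}]$. Let $C_1 \equiv \forall t_3 < \infty\ \exists t_4 > t_3 : \text{comm via } c \text{ at } t_4$ and
$C_2 \equiv \exists t_3 < \infty : \text{wait to } c? \text{ during } [t_3, \infty)$.

To prove $(C_1 \vee C_2) \wedge \mathit{time} = \infty : \{\mathit{time} = 0\}\ {*}[c? \to \text{skip}]\ \{\mathit{false}\}$, we apply the iteration
rule with $C_{nonterm} \equiv C_1 \vee C_2$ and

$$C \equiv (\forall t_3 < \mathit{time}\ \exists t_4 > t_3 : \text{comm via } c \text{ at } t_4) \vee (\exists t_3 < \mathit{time} : \text{wait to } c? \text{ during } [t_3, \infty)).$$

Observe that $C$ expresses that $C_1 \vee C_2$ holds up to the termination time. We show
that the two conditions of the iteration rule are fulfilled:

1. We prove $C : \{C\}\ [c? \to \text{skip}]\ \{C\}$ as follows.

By the rule for guarded command without delay we obtain

$$\hat{C} : \{\mathit{time} = t_0\}\ [c? \to \text{skip}]\ \{\hat{C}\},$$

with

$$\hat{C} \equiv (\forall t_5,\ t_0 \le t_5 < \mathit{time}\ \exists t_6 > t_5 : \text{comm via } c \text{ at } t_6) \vee (\exists t_5,\ t_0 \le t_5 < \mathit{time} : \text{wait to } c? \text{ during } [t_5, \infty)).$$

From the adaptation rule, with $p \equiv C$ and $exp = t_0$,

$$C[t_0/\mathit{time}] \wedge \hat{C} : \{C[t_0/\mathit{time}] \wedge \mathit{time} = t_0\}\ [c? \to \text{skip}]\ \{C[t_0/\mathit{time}] \wedge \hat{C}\}.$$

Since $C[t_0/\mathit{time}] \wedge \hat{C} \to C$, the consequence rule leads to

$$C : \{C[t_0/\mathit{time}] \wedge \mathit{time} = t_0\}\ [c? \to \text{skip}]\ \{C\}.$$

By the quantification rule,

$$C : \{\exists t_0 : C[t_0/\mathit{time}] \wedge \mathit{time} = t_0\}\ [c? \to \text{skip}]\ \{C\}.$$

Since $C \to \exists t_0 : C[t_0/\mathit{time}] \wedge \mathit{time} = t_0$, the consequence rule leads to

$$C : \{C\}\ [c? \to \text{skip}]\ \{C\}.$$

2. We have $(\forall t_1 < \infty\ \exists t_2 > t_1 : C[t_2/\mathit{time}]) \to C_{nonterm}$, since

$$\begin{array}{l}
(\forall t_1 < \infty\ \exists t_2 > t_1 : C[t_2/\mathit{time}]) \equiv \\
(\forall t_1 < \infty\ \exists t_2 > t_1 : [\forall t_3 < t_2\ \exists t_4 > t_3 : \text{comm via } c \text{ at } t_4] \vee [\exists t_3 < t_2 : \text{wait to } c? \text{ during } [t_3, \infty)]) \to \\
([\forall t_3 < \infty\ \exists t_4 > t_3 : \text{comm via } c \text{ at } t_4] \vee [\exists t_3 < \infty : \text{wait to } c? \text{ during } [t_3, \infty)]) \equiv \\
(C_1 \vee C_2) \equiv C_{nonterm}.
\end{array}$$

Now by the iteration rule we obtain $(C_1 \vee C_2) \wedge \mathit{time} = \infty : \{C\}\ {*}[c? \to \text{skip}]\ \{\mathit{false}\}$.
Since logical time-variables such as $t_3$ and $t_4$ range over nonnegative values, $\mathit{time} = 0$
implies $\forall t_3 < \mathit{time}\ \exists t_4 > t_3 : \text{comm via } c \text{ at } t_4$, and hence $\mathit{time} = 0 \to C$. Thus, by the
consequence rule, $(C_1 \vee C_2) \wedge \mathit{time} = \infty : \{\mathit{time} = 0\}\ {*}[c? \to \text{skip}]\ \{\mathit{false}\}$. □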

Consider the parallel composition of statements $S_1$ and $S_2$. For the pre-conditions
we simply take the conjunction. For the post-condition $q$ of $S_1 \| S_2$ we would also
prefer to take the conjunction of the post-conditions $q_1$ and $q_2$ of, respectively, $S_1$ and
$S_2$, but there a small problem has to be solved. Observe that, for $i = 1, 2$, the special
variable time in post-condition $q_i$ of $S_i$ denotes the termination time of $S_i$. Since, in
general, the termination times of $S_1$ and $S_2$ will be different (and then $q_1 \wedge q_2$ could
imply false, see example 5.8), we substitute a logical variable $t_i$ for time in $q_i$. Then
the termination time of $S_1 \| S_2$, expressed by time in its post-condition, is the maximum
of $t_1$ and $t_2$. Furthermore we add two predicates to express that process $S_i$ does not
perform any action between $t_i$ and time. A similar construction is used for the
commitments. This leads to the following rule:

Rule 5.27. (Parallel composition)

$$\dfrac{\begin{array}{l}
C_i : \{p_i\}\,S_i\,\{q_i\}, \quad i = 1, 2 \\
(\exists t_1, t_2 : \mathit{time} = \max(t_1, t_2) \wedge \bigwedge_{i=1}^{2} C_i[t_i/\mathit{time}] \wedge \text{no } dch(S_i) \text{ during } [t_i, \mathit{time})) \to C \\
(\exists t_1, t_2 : \mathit{time} = \max(t_1, t_2) \wedge \bigwedge_{i=1}^{2} q_i[t_i/\mathit{time}] \wedge \text{no } dch(S_i) \text{ during } [t_i, \mathit{time})) \to q
\end{array}}{C : \{p_1 \wedge p_2\}\ S_1 \| S_2\ \{q\}}$$

provided $t_1$ and $t_2$ are fresh logical variables, and $dch(C_i, q_i) \subseteq dch(S_i)$, for $i \in \{1, 2\}$.

Example 5.8. To illustrate the problem with the termination times at parallel
composition, consider the following two (valid) formulae:

$$\mathit{time} = 5 : \{\mathit{time} = 0\}\ \text{delay } 5\ \{\mathit{time} = 5\},$$

and

$$\mathit{time} = 7 : \{\mathit{time} = 0\}\ \text{delay } 7\ \{\mathit{time} = 7\}.$$

Then for $\text{delay } 5 \,\|\, \text{delay } 7$ we cannot take the conjunction of commitments and
post-conditions, but by the rule above we obtain the commitment and post-condition
$\mathit{time} = 7$ because $(\exists t_1, t_2 : \mathit{time} = \max(t_1, t_2) \wedge t_1 = 5 \wedge t_2 = 7) \to (\mathit{time} = 7)$. □
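The arithmetic behind example 5.8 can be spelled out in a few lines. This is only an illustrative sketch of how rule 5.27 combines termination times; the function names are hypothetical.

```python
def parallel_termination(t1, t2):
    """Termination time of S1 || S2 under rule 5.27: time = max(t1, t2)."""
    return max(t1, t2)

def idle_intervals(t1, t2):
    """Half-open intervals [t_i, time) in which component i performs no action."""
    time = parallel_termination(t1, t2)
    return [(t1, time), (t2, time)]

time = parallel_termination(5, 7)   # delay 5 || delay 7
idle = idle_intervals(5, 7)
```

For delay 5 || delay 7 the first component is idle during [5, 7), while the interval for the second component is empty.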

If time does not occur in commitments and post-conditions of the components $S_1$
and $S_2$ then we can derive from rule 5.27 the following simple rule:

Derived rule 5.28. (Simple parallel composition)

$$\dfrac{C_1 : \{p_1\}\,S_1\,\{q_1\}, \quad C_2 : \{p_2\}\,S_2\,\{q_2\}}{C_1 \wedge C_2 : \{p_1 \wedge p_2\}\ S_1 \| S_2\ \{q_1 \wedge q_2\}}$$

provided $dch(C_i, q_i) \subseteq dch(S_i)$, for $i \in \{1, 2\}$, and time does not occur in $C_1$, $C_2$, $q_1$, and
$q_2$.


6.  Example  -  watchdog  timer 

The formalism from §5 is illustrated by an example of a watchdog timer. Consider
the network pictured in figure 1. Process $W$ is a “watchdog” process: its job is to
ensure that processes $P_1, \ldots, P_n$ are functioning properly. We abstract from the task
that has to be performed by $P_i$, but we assume that $P_i$ is functioning correctly iff it
is ready to send (or sending) a reset signal on channel $re_i$ to $W$ at least once every
$v_i$ time units, for some constant $v_i$. So long as all processes $P_i$ are ready to send a
reset signal in time, watchdog timer $W$ communicates on each $re_i$ at least once every
$v_i$ time units and then it does not communicate on channel $al$. As soon as $W$ has to
wait for a reset signal on a particular $re_k$ during $v_k$ time units, then it is ready to send
(or sending) an alarm message on channel $al$ within, say, $K$ time units.

Figure 1. Watchdog timer network.

In this section we first give a formal specification for process $W$. Then, given
specifications for the $P_i$, we prove that $P_1 \| \ldots \| P_n \| W$ is ready to send (or sending) on
channel $al$ iff one of the $P_i$ is not functioning correctly. This is verified using our proof
system without knowing the implementations of $P_1, \ldots, P_n$ and $W$. To demonstrate
program design from a specification, $W$ is implemented as a parallel composition
$W_1 \| \ldots \| W_n \| A$, and we derive the specification of $W$ using specifications for $W_i$ and
$A$. Next $W_i$ and $A$ are, independently, implemented, and we prove that these programs
satisfy the corresponding specifications.

6.1  Specification  of  the  watchdog  timer 

We give a formal specification for the watchdog timer $W$ and derive properties from
it, using certain specifications for the processes $P_i$. In the specification of $W$ we express
that if there is a waiting period of $v_k$ time units to receive input via $re_k$ then, for some
constant $K$, $W$ starts waiting to send on channel $al$ within $K$ until the actual
communication takes place. Furthermore, $W$ tries to communicate via channel $al$ at
a certain point of time only if, for some $k$, there was a previous period of at least $v_k$
time units during which $W$ is waiting to receive input via $re_k$. Let

$$C_0^W \equiv \forall t_0 < \infty : \text{wait to } re_k? \text{ during } (t_0, t_0 + v_k) \to (\exists t \le t_0 + v_k + K : \text{wait to } al! \text{ at } t \text{ until comm}),$$

$$C_1^W \equiv \forall t_1 < \infty : \text{wait to } al! \text{ at } t_1 \text{ until comm} \to (\exists k\, \exists t_2 \le t_1 : \text{wait to } re_k? \text{ during } (t_2, t_2 + v_k)).$$

Then we specify $W$ by $C_0^W \wedge C_1^W : \{\mathit{time} = 0\}\ W\ \{\mathit{true}\}$.

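We prove that $W$ tries to send a message via $al$ iff there is an error in one of the
processes $P_i$. To this end we assume given a specification for $P_i$ in which we use a
predicate $error_i$ representing some erroneous behaviour of $P_i$. Thus assume that, for
all $i$, we have $C^{P_i} : \{\mathit{time} = 0\}\ P_i\ \{\mathit{true}\}$, where

$$C^{P_i} \equiv error_i \leftrightarrow (\exists t_0 < \infty : \text{no } \{re_i!, re_i\} \text{ during } (t_0, t_0 + v_i)).$$

This asserts that there is an error in $P_i$ iff there exists a period of $v_i$ time units during
which $P_i$ is not communicating via $re_i$ and not waiting to communicate via $re_i$. Given
our specifications for $P_1, \ldots, P_n$ and $W$, we try to prove that $P_1 \| \ldots \| P_n \| W$ satisfies
the commitment $(\exists k : error_k) \leftrightarrow (\exists t < \infty : \text{wait to } al! \text{ at } t \text{ until comm})$. Applying the simple
parallel composition rule $n$ times we obtain

$$C_0^W \wedge C_1^W \wedge \bigwedge_{i=1}^{n} C^{P_i} : \{\mathit{time} = 0\}\ P_1 \| \ldots \| P_n \| W\ \{\mathit{true}\}.$$

By the well-formedness axiom and the consequence rule we can derive, with
$cset = \bigcup_{k=1}^{n} \{re_k, re_k!, re_k?\}$,

$$\mathit{WellForm}_{cset} : \{\mathit{true}\}\ P_1 \| \ldots \| P_n \| W\ \{\mathit{true}\}.$$

Using the conjunction rule we obtain the following commitment:

$$C_0^W \wedge C_1^W \wedge \bigwedge_{i=1}^{n} C^{P_i} \wedge \mathit{WellForm}_{cset}.$$

First we prove that this commitment implies

$$(\exists t < \infty : \text{wait to } al! \text{ at } t \text{ until comm}) \to (\exists k : error_k).$$

$$\begin{array}{ll}
 & \exists t < \infty : \text{wait to } al! \text{ at } t \text{ until comm} \\
\Rightarrow \{C_1^W\} & \exists t < \infty\ \exists k\, \exists t_2 \le t : \text{wait to } re_k? \text{ during } (t_2, t_2 + v_k) \\
\Rightarrow \{\text{calculus}\} & \exists k\, \exists t_2 < \infty : \text{wait to } re_k? \text{ during } (t_2, t_2 + v_k) \\
\Rightarrow \{\text{lemma 5.9}\} & \exists k\, \exists t_2 < \infty : \text{no } \{re_k!, re_k\} \text{ during } (t_2, t_2 + v_k) \\
\Rightarrow \{C^{P_k}\} & (\exists k : error_k)
\end{array}$$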


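Next we try to prove $(\exists k : error_k) \to (\exists t < \infty : \text{wait to } al! \text{ at } t \text{ until comm})$.

From $\exists k : error_k$ we obtain, by $C^{P_k}$, $\exists k\, \exists t_0 < \infty : \text{no } \{re_k!, re_k\} \text{ during } (t_0, t_0 + v_k)$, and
thus $\exists k\, \exists t_0 < \infty : \text{no comm via } re_k \text{ during } (t_0, t_0 + v_k)$. With the current specification of
$W$, however, nothing can be derived from this. The specification of $W$ only expresses
how $W$ should behave if it does something on any of the channels. But then $W$ need
not do anything; even the simple program skip would satisfy its specification.
Therefore we modify the specification for $W$ as follows: $C_1^W \wedge C_2^W : \{\mathit{time} = 0\}\ W\ \{\mathit{true}\}$,
with

$$C_1^W \equiv \forall t_1 < \infty : \text{wait to } al! \text{ at } t_1 \text{ until comm} \to \exists k\, \exists t_2 \le t_1 : \text{wait to } re_k? \text{ during } (t_2, t_2 + v_k),$$

$$C_2^W \equiv \forall t_3 < \infty : \text{no comm via } re_k \text{ during } (t_3, t_3 + v_k) \to \exists t_4 \le t_3 + v_k + K : \text{wait to } al! \text{ at } t_4 \text{ until comm}.$$

Note that $C_0^W$ follows from $C_2^W$, because $\text{wait to } re_k? \text{ during } (t_0, t_0 + v_k)$ implies, by
lemma 5.9, $\text{no } \{re_k!, re_k\} \text{ during } (t_0, t_0 + v_k)$, and hence $\text{no comm via } re_k \text{ during } (t_0, t_0 + v_k)$.
Now the proof proceeds as follows, for all $k$,

$$\begin{array}{ll}
 & (\exists k : error_k) \\
\Rightarrow \{C^{P_k}\} & \exists k\, \exists t_0 < \infty : \text{no } \{re_k!, re_k\} \text{ during } (t_0, t_0 + v_k) \\
\Rightarrow \{\text{definition}\} & \exists k\, \exists t_0 < \infty : \text{no comm via } re_k \text{ during } (t_0, t_0 + v_k) \\
\Rightarrow \{C_2^W\} & \exists k\, \exists t_0 < \infty\ \exists t_4 \le t_0 + v_k + K : \text{wait to } al! \text{ at } t_4 \text{ until comm} \\
\Rightarrow \{\text{calculus}\} & \exists t_4 < \infty : \text{wait to } al! \text{ at } t_4 \text{ until comm}.
\end{array}$$

Figure 2. Implementation of the watchdog timer.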


6.2  Implementing  the  watchdog  timer 

Next we design a program implementing watchdog process $W$ that satisfies the
required specification. Since $W$ has to watch all processes $P_1, \ldots, P_n$ simultaneously,
our first design step is to implement $W$ as a parallel composition, $W \equiv W_1 \| \ldots \| W_n \| A$.

Process $W_i$ is a watchdog for $P_i$, and it signals process $A$ via channel $a_i$ as soon as
there is no communication on $re_i$ for at least $v_i$ time units. Process $A$ waits for a
signal on any of the $a_i$'s; after receipt of a signal it tries to send a message on $al$ (see
figure 2). We give specifications for $W_i$ and $A$ and prove that they are sufficient to
derive the specification of $W$. The specification for $W_i$ expresses that $W_i$ tries to
communicate via $a_i$ only if it has been waiting to communicate via $re_i$ during a period
of $v_i$ time units. On the other hand, if there is a period of $v_i$ time units during
which no communication via $re_i$ occurs, then $W_i$ will try to communicate via $a_i$ within
a certain time bound $K_i$. Define

$$C_1^{W_i} \equiv \forall t_1 < \infty : \text{wait to } a_i! \text{ at } t_1 \text{ until comm} \to \exists t_2 \le t_1 : \text{wait to } re_i? \text{ during } (t_2, t_2 + v_i),$$

$$C_2^{W_i} \equiv \forall t_3 < \infty : \text{no comm via } re_i \text{ during } (t_3, t_3 + v_i) \to \exists t_4 \le t_3 + v_i + K_i : \text{wait to } a_i! \text{ at } t_4 \text{ until comm}.$$

Then $W_i$ is specified by $C_1^{W_i} \wedge C_2^{W_i} : \{\mathit{time} = 0\}\ W_i\ \{\mathit{true}\}$.

The specification for $A$ asserts that it tries to send a message via $al$ only if there was
a preceding communication via one of the $a_k$. If $A$ is not waiting to communicate
via one of the $a_k$ at a certain point of time, then within, say, $K_A$ time units it will
wait to communicate via $al$ until the actual communication can be performed. Define

$$C_1^A \equiv \forall t_1 < \infty : \text{wait to } al! \text{ at } t_1 \text{ until comm} \to \exists k\, \exists t_2 \le t_1 : \text{comm via } a_k \text{ during } [t_2, t_2 + K_c),$$

$$C_2^A \equiv \forall t_3 < \infty : \neg\,\text{wait to } a_k? \text{ at } t_3 \to \exists t_4 < t_3 + K_A : \text{wait to } al! \text{ at } t_4 \text{ until comm}.$$

Then $C_1^A \wedge C_2^A : \{\mathit{time} = 0\}\ A\ \{\mathit{true}\}$.

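We show that $W_1 \| \ldots \| W_n \| A$ satisfies the specification of $W$ (using the specifications
of $W_1, \ldots, W_n$, and $A$ only). By the repeated application of the simple parallel
composition rule, we obtain the conjunction of the commitments of the processes:
$\bigwedge_{i=1}^{n} (C_1^{W_i} \wedge C_2^{W_i}) \wedge C_1^A \wedge C_2^A$. By the well-formedness axiom and the conjunction
rule we can add $\mathit{WellForm}_{cset}$, with $cset = \bigcup_{k=1}^{n} \{a_k, a_k!, a_k?\}$, leading to the following commitment:

$$\bigwedge_{i=1}^{n} (C_1^{W_i} \wedge C_2^{W_i}) \wedge C_1^A \wedge C_2^A \wedge \mathit{WellForm}_{cset}.$$

This implies $C_1^W$ as follows, for all $t_1 < \infty$,

$$\begin{array}{ll}
 & \text{wait to } al! \text{ at } t_1 \text{ until comm} \\
\Rightarrow \{C_1^A\} & \exists k\, \exists t_2 \le t_1 : \text{comm via } a_k \text{ during } [t_2, t_2 + K_c) \\
\Rightarrow \{\text{definition}\} & \exists k\, \exists t_2 \le t_1 : \text{wait to } a_k! \text{ at } t_2 \text{ until comm} \\
\Rightarrow \{C_1^{W_k}\} & \exists k\, \exists t_2 \le t_1\ \exists t_3 \le t_2 : \text{wait to } re_k? \text{ during } (t_3, t_3 + v_k) \\
\Rightarrow \{t_3 \le t_2 \le t_1\} & \exists k\, \exists t_3 \le t_1 : \text{wait to } re_k? \text{ during } (t_3, t_3 + v_k).
\end{array}$$

Next we prove $C_2^W$. For all $t_3 < \infty$,

$$\begin{array}{ll}
 & \text{no comm via } re_k \text{ during } (t_3, t_3 + v_k) \\
\Rightarrow \{C_2^{W_k}\} & \exists t_4 \le t_3 + v_k + K_k : \text{wait to } a_k! \text{ at } t_4 \text{ until comm} \\
\Rightarrow \{\text{lemma 5.10}\} & \exists t_4 \le t_3 + v_k + K_k\ \forall t_5,\ t_4 < t_5 < t_4 + K_c : \neg\,\text{wait to } a_k? \text{ at } t_5 \\
\Rightarrow \{C_2^A\} & \exists t_4 \le t_3 + v_k + K_k\ \forall t_5,\ t_4 < t_5 < t_4 + K_c\ \exists t_6 < t_5 + K_A : \text{wait to } al! \text{ at } t_6 \text{ until comm} \\
\Rightarrow \{\text{calculus}\} & \exists t_4 \le t_3 + v_k + K_k\ \exists t_6 \le t_4 + K_A : \text{wait to } al! \text{ at } t_6 \text{ until comm} \\
\Rightarrow \{\text{calculus}\} & \exists t_6 \le t_3 + v_k + K_k + K_A : \text{wait to } al! \text{ at } t_6 \text{ until comm}.
\end{array}$$

Hence the specification of $W$ can be derived provided $K_k + K_A \le K$, for all $k$.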
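The timing bound obtained in this derivation is plain addition, which the following sketch makes explicit. The function and parameter names are hypothetical, and the numbers in the usage lines are made-up sample values.

```python
def alarm_bound(t3, v_k, k_k, k_a):
    """Latest time by which W1 || ... || Wn || A waits to send on al,
    given that no communication via re_k occurs during (t3, t3 + v_k)."""
    t4_bound = t3 + v_k + k_k   # by then W_k waits to send on a_k
    return t4_bound + k_a       # by then A waits to send on al

def refines_w(k_ws, k_a, k):
    """Side condition of the refinement: K_k + K_A <= K for all k."""
    return all(k_k + k_a <= k for k_k in k_ws)

bound = alarm_bound(t3=10, v_k=4, k_k=2, k_a=3)
condition_ok = refines_w([1, 2], 3, 5)
condition_violated = refines_w([1, 3], 3, 5)
```

With these sample values the alarm is due by time 19, and the second configuration violates the side condition because 3 + 3 exceeds 5.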


6.3  Final  implementations


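Finally, we give implementations for the processes $A$ and $W_i$, and we show that these
programs meet the required specifications.


Implementation of $A$: First we show that $A$ can be implemented as $[\Box_{i=1}^{n}\, a_i? \to al!]$.


We have to prove

$$C_1^A \wedge C_2^A : \{\mathit{time} = 0\}\ [\Box_{i=1}^{n}\, a_i? \to al!]\ \{\mathit{true}\}.$$

Define, for $A \equiv [\Box_{i=1}^{n}\, a_i? \to al!]$ and $i \in \{1, \ldots, n\}$,

$$C_i^1 \equiv \text{wait in } A \text{ during } [t_0, t) \wedge \text{comm via } a_i \text{ during } [t, t + K_c), \quad\text{and}\quad C^2 \equiv \text{wait to } al! \text{ at } t + K_c \text{ until comm}.$$

We apply the rule for guarded command without delay using

$C_i \equiv \exists t,\ t_0 \le t < \infty : C_i^1 \wedge C^2$, $C_{nonterm} \equiv \text{wait in } A \text{ during } [t_0, \infty)$,

$p_i \equiv \exists t,\ t_0 \le t < \infty : C_i^1 \wedge \mathit{time} = t + K_c$, and $q_i \equiv \mathit{true}$. Then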
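1. $\text{wait in } A \text{ during } [t_0, \infty) \wedge \mathit{time} = \infty \to C_{nonterm}$.

2. $(\exists t,\ t_0 \le t < \infty : \text{wait in } A \text{ during } [t_0, t) \wedge \text{comm via } a_i \text{ during } [t, t + K_c) \wedge \mathit{time} = t + K_c) \to p_i$, for $i \in \{1, \ldots, n\}$.

3. $C_i : \{p_i\}\ al!\ \{q_i\}$, for $i \in \{1, \ldots, n\}$, can be derived as follows.

By the send rule and the consequence rule,

$$\text{wait to } al! \text{ at } t_1 \text{ until comm} : \{\mathit{time} = t_1\}\ al!\ \{\mathit{true}\}.$$

Applying the adaptation rule, with $exp = t_1 + K_c$, we obtain

$$(p_i[t_1 + K_c/\mathit{time}] \wedge \text{wait to } al! \text{ at } t_1 + K_c \text{ until comm}) : \{p_i[t_1 + K_c/\mathit{time}] \wedge \mathit{time} = t_1 + K_c\}\ al!\ \{\mathit{true}\}.$$

Since

$$\begin{array}{l}
(p_i[t_1 + K_c/\mathit{time}] \wedge \text{wait to } al! \text{ at } t_1 + K_c \text{ until comm}) \to \\
(\exists t,\ t_0 \le t < \infty : C_i^1 \wedge t_1 + K_c = t + K_c \wedge \text{wait to } al! \text{ at } t_1 + K_c \text{ until comm}) \to \\
(\exists t,\ t_0 \le t < \infty : C_i^1 \wedge C^2),
\end{array}$$

the consequence rule leads to

$$C_i : \{p_i[t_1 + K_c/\mathit{time}] \wedge \mathit{time} = t_1 + K_c\}\ al!\ \{\mathit{true}\}.$$

By the quantification rule,

$$C_i : \{\exists t_1 : p_i[t_1 + K_c/\mathit{time}] \wedge \mathit{time} = t_1 + K_c\}\ al!\ \{\mathit{true}\}.$$

Since $p_i \to \exists t_1 : p_i[t_1 + K_c/\mathit{time}] \wedge \mathit{time} = t_1 + K_c$, by the consequence rule,

$$C_i : \{p_i\}\ al!\ \{q_i\}.$$

Then by the rule for guarded command without delay we obtain

$$C_{nonterm} \vee \bigvee_{i=1}^{n} C_i : \{\mathit{time} = t_0\}\ [\Box_{i=1}^{n}\, a_i? \to al!]\ \{\mathit{true}\}.$$

By the substitution rule, using $exp = 0$,

$$C_{nonterm}[0/t_0] \vee \bigvee_{i=1}^{n} C_i[0/t_0] : \{\mathit{time} = 0\}\ [\Box_{i=1}^{n}\, a_i? \to al!]\ \{\mathit{true}\}.$$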

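We prove that this commitment implies $C_1^A \wedge C_2^A$.

• First prove $C_1^A \equiv \forall t_1 < \infty : \text{wait to } al! \text{ at } t_1 \text{ until comm} \to \exists k\, \exists t_2 \le t_1 : \text{comm via } a_k \text{ during } [t_2, t_2 + K_c)$.

- By the definition of $\text{wait in } A \text{ during } [0, \infty)$, $C_{nonterm}[0/t_0]$ leads to
$\text{no } \{al!, al\} \text{ during } [0, \infty)$, and thus $\forall t_1 < \infty : \neg\,\text{wait to } al! \text{ at } t_1 \text{ until comm}$.

- $\bigvee_{i=1}^{n} C_i[0/t_0]$ implies $\exists k\, \exists t < \infty : C_k^1[0/t_0]$, and thus, by the definition of $\text{wait in } A \text{ during } [0, t)$,
$\exists k\, \exists t < \infty : \text{no } \{al!, al\} \text{ during } [0, t) \wedge \text{comm via } a_k \text{ during } [t, t + K_c)$.
Thus, for all $t_1 < \infty$, $\text{wait to } al! \text{ at } t_1 \text{ until comm}$ implies $t_1 \ge t$, and hence
$\exists t \le t_1 : \text{comm via } a_k \text{ during } [t, t + K_c)$.

• Next we prove $C_2^A$, that is,

$$\forall t_3 < \infty : \neg\,\text{wait to } a_k? \text{ at } t_3 \to \exists t_4 < t_3 + K_A : \text{wait to } al! \text{ at } t_4 \text{ until comm}.$$

- From $C_{nonterm}[0/t_0]$, we obtain $\forall i \in \{1, \ldots, n\} : \text{wait to } a_i? \text{ during } [0, \infty)$,
and hence $\forall t_3 < \infty\ \forall k \in \{1, \ldots, n\} : \text{wait to } a_k? \text{ at } t_3$.

- Assume $\neg\,\text{wait to } a_k? \text{ at } t_3$, for $t_3 < \infty$. By $C_k[0/t_0]$, there exists a $t < \infty$ such
that $\text{wait to } a_k? \text{ during } [0, t)$ and $\text{wait to } al! \text{ at } t + K_c \text{ until comm}$. Then $t \le t_3$,
thus $t + K_c \le t_3 + K_c$, and hence $\exists t_4 < t_3 + K_A : \text{wait to } al! \text{ at } t_4 \text{ until comm}$. This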
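leads to $C_2^A$, provided $K_c < K_A$.

Implementation of $W_i$: Next we implement $W_i$ by $*[re_i? \to \text{skip}\ \Box\ \text{delay } v_i \to a_i!]$ and
show that, under certain restrictions, this program satisfies the required specification
$C_1^{W_i} \wedge C_2^{W_i} : \{\mathit{time} = 0\}\ W_i\ \{\mathit{true}\}$. Define

$$C_1(t_1) \equiv \text{wait to } a_i! \text{ at } t_1 \text{ until comm} \to \exists t_2 \le t_1 : \text{wait to } re_i? \text{ during } (t_2, t_2 + v_i),$$

$$C_2(t_3) \equiv \text{no comm via } re_i \text{ during } (t_3, t_3 + v_i) \to \exists t_4 \le t_3 + v_i + K_i : \text{wait to } a_i! \text{ at } t_4 \text{ until comm}.$$

Then

$$C_1^{W_i} \equiv \forall t_1 < \infty : C_1(t_1) \quad\text{and}\quad C_2^{W_i} \equiv \forall t_3 < \infty : C_2(t_3).$$

We apply the iteration rule with

$$C_{nonterm} \equiv C_1^{W_i} \wedge C_2^{W_i}, \quad\text{and}\quad C \equiv \forall t_1 < \mathit{time} : C_1(t_1) \wedge \forall t_3 < \mathit{time} : C_2(t_3).$$

If the assumptions for the iteration rule are fulfilled (which is shown below) we
obtain

$$C_{nonterm} \wedge \mathit{time} = \infty : \{C\}\ {*}[re_i? \to \text{skip}\ \Box\ \text{delay } v_i \to a_i!]\ \{\mathit{false}\}.$$

Since $C_{nonterm} \wedge \mathit{time} = \infty \to C_1^{W_i} \wedge C_2^{W_i}$, $\mathit{time} = 0 \to C$, and $\mathit{false} \to \mathit{true}$, the consequence
rule leads to $C_1^{W_i} \wedge C_2^{W_i} : \{\mathit{time} = 0\}\ W_i\ \{\mathit{true}\}$.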

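To apply the iteration rule we have to prove

$$(\forall t < \infty\ \exists t_0 > t : C[t_0/time]) \rightarrow C_{nonterm}, \qquad (5)$$

$$C : \{C\}\ [\,re_i? \rightarrow skip\ \square\ delay\ v_i \rightarrow a_i!\,]\ \{C\}. \qquad (6)$$

Proof of (5). Observe that

$$(\forall t < \infty\ \exists t_0 > t : C[t_0/time]) \equiv (\forall t < \infty\ \exists t_0 > t : (\forall t_1 < t_0 : C_1(t_1) \wedge \forall t_3 < t_0 : C_2(t_3)))$$

$$\rightarrow (\forall t_1 < \infty : C_1(t_1) \wedge \forall t_3 < \infty : C_2(t_3)) \equiv C_{nonterm},$$

and hence (5) holds.

Proof of (6). Consider $C^\alpha \equiv \forall t_1 < t_5 : C_1(t_1) \wedge \forall t_3 < t_5 : C_2(t_3)$, and $C^\beta \equiv \forall t_1, t_5 \le t_1 < time : C_1(t_1) \wedge \forall t_3, t_5 \le t_3 < time : C_2(t_3)$. Then $C \leftrightarrow C^\alpha \wedge C^\beta$. Below we derive

$$C^\beta : \{time = t_5\}\ [\,re_i? \rightarrow skip\ \square\ delay\ v_i \rightarrow a_i!\,]\ \{C^\beta\}. \qquad (7)$$

From (7) we obtain by the adaptation rule, with $p \equiv C^\alpha$ and $exp \equiv t_5$,

$$C^\alpha \wedge C^\beta : \{C^\alpha \wedge time = t_5\}\ [\,re_i? \rightarrow skip\ \square\ delay\ v_i \rightarrow a_i!\,]\ \{C^\alpha \wedge C^\beta\}.$$

Using $C \leftrightarrow (C^\alpha \wedge C^\beta)$, by the consequence rule,

$$C : \{C^\alpha \wedge time = t_5\}\ [\,re_i? \rightarrow skip\ \square\ delay\ v_i \rightarrow a_i!\,]\ \{C\}.$$

Since $C \rightarrow (\exists t_5 : C[t_5/time] \wedge time = t_5) \rightarrow (\exists t_5 : C^\alpha \wedge time = t_5)$, the quantification rule and the consequence rule lead to (6).

Proof of (7). Let $G \equiv [\,re_i? \rightarrow skip\ \square\ delay\ v_i \rightarrow a_i!\,]$.

Apply the rule for guarded command with delay, using

$$p_1 \equiv \exists t, t_5 \le t < t_5 + v_i : \text{wait to } re_i? \text{ during } [t_5, t) \wedge \text{comm via } re_i \text{ during } [t, t + K_c) \wedge \text{no } \{a_i, a_i!\} \text{ during } [t_5, t + K_c) \wedge time = t + K_c.$$

Then $(\exists t, t_5 \le t < t_5 + v_i : \text{wait in } G \text{ during } [t_5, t) \wedge \text{comm } re_i \text{ in } G \text{ from } t) \rightarrow p_1$, thus we can derive (7), provided

$$C^\beta : \{p_1\}\ skip\ \{C^\beta\}, \qquad (8)$$

$$C^\beta : \{\text{wait in } G \text{ during } [t_5, t_5 + v_i) \wedge time = t_5 + v_i\}\ a_i!\ \{C^\beta\}. \qquad (9)$$

Proof of (8). Derive by the skip axiom $time = t_0 : \{time = t_0\}\ skip\ \{time = t_0\}$.

Then by the adaptation rule, with $exp \equiv t + K_c$ and

$$p \equiv t_5 \le t < t_5 + v_i \wedge \text{wait to } re_i? \text{ during } [t_5, t) \wedge \text{comm via } re_i \text{ during } [t, t + K_c) \wedge \text{no } \{a_i, a_i!\} \text{ during } [t_5, t + K_c),$$

we obtain $p \wedge time = t + K_c : \{p \wedge time = t + K_c\}\ skip\ \{p \wedge time = t + K_c\}$.

By the consequence rule and the quantification rule we obtain $p_1 : \{p_1\}\ skip\ \{p_1\}$.
Observe that

$$p_1 \Rightarrow \forall t_1, t_5 \le t_1 < time : \neg\, \text{wait to } a_i! \text{ at } t_1 \wedge \neg\, \text{comm via } a_i \text{ at } t_1$$

$$\Rightarrow \forall t_1, t_5 \le t_1 < time : \neg\, \text{wait to } a_i! \text{ at } t_1 \text{ until comm}$$

$$\Rightarrow \forall t_1, t_5 \le t_1 < time : C_1(t_1),$$

and

$$p_1 \Rightarrow \exists t, t_5 \le t < t_5 + v_i : \text{comm via } re_i \text{ during } [t, t + K_c) \wedge time = t + K_c$$

$$\Rightarrow \forall t_3, t_5 \le t_3 < time\ \exists t, t_3 \le t < t_3 + v_i : \text{comm via } re_i \text{ at } t$$

$$\Rightarrow \forall t_3, t_5 \le t_3 < time : \neg\, \text{no comm via } re_i \text{ during } (t_3, t_3 + v_i)$$

$$\Rightarrow \forall t_3, t_5 \le t_3 < time : C_2(t_3).$$

Hence $p_1 \rightarrow C^\beta$, and thus the consequence rule leads to (8).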


Proof of (9). Define

$$C^a \equiv \exists t \ge t_5 + v_i : \text{wait to } a_i! \text{ at } t_5 + v_i \text{ until comm at } t \wedge time = t + K_c.$$

From the send rule, the consequence rule and the substitution rule (replacing $t_0$ by $t_5 + v_i$) we obtain $C^a : \{time = t_5 + v_i\}\ a_i!\ \{C^a \wedge time < \infty\}$.

Define $C^b \equiv \text{wait to } re_i? \text{ during } [t_5, t_5 + v_i) \wedge \text{no } \{a_i, a_i!\} \text{ during } [t_5, t_5 + v_i)$.

Then by the adaptation rule we can derive

$$C^a \wedge C^b : \{time = t_5 + v_i \wedge C^b\}\ a_i!\ \{C^a \wedge C^b \wedge time < \infty\}.$$

Since $\text{wait in } G \text{ during } [t_5, t_5 + v_i) \wedge time = t_5 + v_i \rightarrow time = t_5 + v_i \wedge C^b$, we obtain (9) by the consequence rule, if $C^a \wedge C^b$ implies $C^\beta$. Recall that

$$C^\beta \equiv (\forall t_1, t_5 \le t_1 < time : \text{wait to } a_i! \text{ at } t_1 \text{ until comm} \rightarrow \exists t_2 \le t_1 : \text{wait to } re_i? \text{ during } (t_2, t_2 + v_i))\ \wedge$$

$$(\forall t_3, t_5 \le t_3 < time : \text{no comm via } re_i \text{ during } (t_3, t_3 + v_i) \rightarrow \exists t_4 \le t_3 + v_i + K_i : \text{wait to } a_i! \text{ at } t_4 \text{ until comm}).$$

It remains to prove $C^a \wedge C^b \rightarrow C^\beta$:

• For all $t_1$, $t_5 \le t_1 < time$: if wait to $a_i!$ at $t_1$ until comm, then from $C^a \wedge C^b$, $t_1 \ge t_5 + v_i$. Hence, from $C^b$, there exists a $t_2 \le t_1$ (viz., $t_5$) such that wait to $re_i?$ during $(t_2, t_2 + v_i)$, provided $v_i > 0$.

• For all $t_3$, $t_5 \le t_3 < time$: assume no comm via $re_i$ during $(t_3, t_3 + v_i)$. From $C^a$ we obtain wait to $a_i!$ at $t_5 + v_i$ until comm. Hence, there exists a $t_4 \le t_3 + v_i + K_i$ such that wait to $a_i!$ at $t_4$ until comm, if $t_5 + v_i \le t_3 + v_i + K_i$. Since $t_5 \le t_3$, this holds provided $K_i \ge 0$.

Conclusion: By giving programs that implement $A$ and $W_i$, we have obtained an implementation that satisfies the top-level specification for a watchdog timer as given in §6.1. To conclude this example, we analyse the requirements which have been imposed upon the constants $K$, $K_i$, $K_A$ and $v_i$ to prove the correctness of our implementation. To justify the refinement step from the previous section, we required $K_i + K_A \le K$, for all $i = 1, \ldots, n$. The implementation given in this section has been proved to satisfy the specification for all $K_A$ and $K_i$ such that $K_c < K_A$ and $K_i \ge 0$, and provided $v_i > 0$, for all $i = 1, \ldots, n$. Observe that if $K > K_c$ then for $K_A = K$ and $K_i = 0$ we have $K_c < K_A$ and $K_i \ge 0$, and also $K_i + K_A = K \le K$. Hence, if $v_i > 0$ for all $i = 1, \ldots, n$, and $K > K_c$, i.e., the constant in the specification must be greater than the duration of a communication, then our implementation meets the top-level specification for $W$.
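
The side conditions collected in this conclusion can be checked mechanically for concrete numbers. The following is only a minimal sketch under stated assumptions: the function name `watchdog_constants` and the dictionary of named checks are hypothetical and not part of the proof system; the choice $K_A = K$, $K_i = 0$ is the one suggested in the text.

```python
def watchdog_constants(K, Kc, v):
    """Choose K_A = K and K_i = 0 (as in the text) and evaluate every
    side condition that the correctness proof imposed on the constants.

    K  : constant of the top-level specification
    Kc : duration of one communication
    v  : list of time-out values v_1, ..., v_n
    """
    K_A = K
    K_i = [0] * len(v)
    checks = {
        "Kc < K_A": Kc < K_A,                                # needed for C_2^A
        "K_i >= 0": all(k >= 0 for k in K_i),                # needed for C_2^{W_i}
        "v_i > 0": all(x > 0 for x in v),                    # needed for C_1^{W_i}
        "K_i + K_A <= K": all(k + K_A <= K for k in K_i),    # refinement step
    }
    return K_A, K_i, checks


# Example: K = 3 exceeds Kc = 1, so all conditions hold.
K_A, K_i, checks = watchdog_constants(K=3, Kc=1, v=[2, 5])
print(all(checks.values()))
```

With $K = K_c$ the check `"Kc < K_A"` fails, which mirrors the requirement that the specification constant be strictly greater than the communication time.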


7.  Adding  assumptions  to  real-time 

In  general,  real-time  embedded  systems  have  an  intensive  interaction  with  their 
environment,  and  usually  their  correctness  strongly  depends  on  assumptions  about 
the  environment.  Therefore  it  is  convenient  to  use  correctness  formulae  in  which  these 
assumptions  can  be  expressed.  Hence,  similar  to  §  4,  we  extend  the  formulae  C:{p}S{q} 
from  §5  with  a  fourth  assertion,  called  assumption,  leading  to  formulae  of  the  form 
(A,  C):{p}S{q}.  Assumption  A  should  neither  contain  program  variables  nor  the 
special  variable  time. 

In  this  assumption/commitment  formalism  for  real-time  properties,  we  can  now 
express,  for  instance,  in  the  assumption  when  the  environment  of  a  process  is  waiting 
to  communicate.  With  such  an  assumption  we  can  determine  when  the  communication 
must  take  place.  For  example, 

$(A \equiv \text{wait to } c? \text{ at } 5 \text{ until comm} \wedge \text{no comm via } c \text{ during } [3, 5),$

$\ C \equiv \text{comm via } c \text{ during } [5, 5 + K_c)) :$

$\{time = 3\}\ c!\ \{time = 5 + K_c\}.$

Note  that,  by  using  the  maximal  parallelism  model,  a  communication  takes  place  as 
soon  as  both  process  and  environment  are  ready  to  perform  the  communication. 

In  the  remainder  of  this  section  we  indicate  how  the  proof  system  from  §  5  can  be 
extended to obtain a compositional proof system for these assumption/commitment-formulae. First we consider a compositional rule for the parallel composition of $S_1$ and $S_2$. Concerning pre-conditions, post-conditions and commitments, the rule is identical to rule 5.27 for the commitment-formalism. For the assumptions, we have the same requirements as in rule 4.2: $A \wedge C_1 \rightarrow A_2$ and $A \wedge C_2 \rightarrow A_1$.

Rule 7.1. (Parallel composition)

$$\frac{\begin{array}{c}
(A_1, C_1) : \{p_1\}\ S_1\ \{q_1\}, \quad (A_2, C_2) : \{p_2\}\ S_2\ \{q_2\} \\
q_1[t_1/time] \wedge q_2[t_2/time] \wedge time = \max(t_1, t_2) \rightarrow q \\
C_1[t_1/time] \wedge C_2[t_2/time] \wedge time = \max(t_1, t_2) \rightarrow C \\
A \wedge C_1 \rightarrow A_2, \quad A \wedge C_2 \rightarrow A_1
\end{array}}{(A, C) : \{p_1 \wedge p_2\}\ S_1 \parallel S_2\ \{q\}}$$

provided $t_1$ and $t_2$ are fresh logical variables, and $dch(C_i, q_i) \subseteq dch(S_i)$, for $i \in \{1, 2\}$.
A  typical  application  of  this  rule  can  be  found  in  the  next  example. 

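Example 7.1. Consider the following specifications (delay $d$ is used to represent any internal actions which take $d$ time units, and we assume communications take one time unit, i.e., $K_c = 1$):

$(A_1 \equiv \text{wait to } c? \text{ at } 2 \text{ until comm} \wedge \text{no comm via } c \text{ during } [0, 2)\ \wedge$

$\qquad \text{wait to } d? \text{ at } 6 \text{ until comm} \wedge \text{no comm via } d \text{ during } [3, 6),$

$\ C_1 \equiv \text{comm via } c \text{ during } [2, 3) \wedge \text{wait to } d! \text{ at } 3 \text{ until comm}\ \wedge$

$\qquad \text{no comm via } d \text{ during } [0, 3)) :$

$\{time = 0\}\ c!;\ d!;\ delay\ 2\ \{time = 9\}$, and

$(A_2 \equiv \text{wait to } d! \text{ at } 3 \text{ until comm} \wedge \text{no comm via } d \text{ during } [0, 3),$

$\ C_2 \equiv \text{wait to } d? \text{ at } 6 \text{ until comm} \wedge \text{no comm via } d \text{ during } [0, 6)\ \wedge$

$\qquad \text{comm via } d \text{ during } [6, 7)) :$

$\{time = 0\}\ delay\ 6;\ d?\ \{time = 7\}.$

Take for $(c!;\ d!;\ delay\ 2) \parallel (delay\ 6;\ d?)$ the following assumption:

$A \equiv \text{wait to } c? \text{ at } 2 \text{ until comm} \wedge \text{no comm via } c \text{ during } [0, 2).$

Since $A \wedge C_1 \rightarrow A_2$ and $A \wedge C_2 \rightarrow A_1$, the parallel composition rule leads to

$(A, C_1 \wedge C_2) : \{time = 0\}\ (c!;\ d!;\ delay\ 2) \parallel (delay\ 6;\ d?)\ \{time = 9\}.$

Using a consequence rule we can easily derive from $C_1 \wedge C_2$ the following commitment:
comm via $c$ during $[2, 3)$ $\wedge$ comm via $d$ during $[6, 7)$. □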
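
The timing claimed in Example 7.1 can be reproduced by a small simulation. This is a minimal sketch, not the paper's semantics: the function `simulate`, the action encoding (`"send"`, `"recv"`, `"delay"`) and the `env_ready` map are hypothetical names introduced here. It assumes maximal parallelism in the sense used above (a communication starts as soon as both partners are ready), at most two processes per internal channel, and a single external partner per external channel.

```python
def simulate(processes, env_ready, kc=1):
    """Run sequential processes under maximal parallelism.

    processes: list of programs; each program is a list of actions
               ("send", ch), ("recv", ch) or ("delay", d).
    env_ready: channel -> time from which the environment is willing
               to communicate on that (external) channel.
    Returns (final clock of each process, {channel: (start, end)});
    only the last communication per channel is recorded.
    """
    env_ready = dict(env_ready)      # local copy; entries are consumed
    clocks = [0] * len(processes)
    pcs = [0] * len(processes)
    comms = {}
    progress = True
    while progress:
        progress = False
        # A delay only advances the local clock of its own process.
        for i, prog in enumerate(processes):
            while pcs[i] < len(prog) and prog[pcs[i]][0] == "delay":
                clocks[i] += prog[pcs[i]][1]
                pcs[i] += 1
                progress = True
        # Collect the processes currently waiting on each channel.
        waiting = {}
        for i, prog in enumerate(processes):
            if pcs[i] < len(prog):
                waiting.setdefault(prog[pcs[i]][1], []).append(i)
        for ch, ws in waiting.items():
            kinds = {processes[w][pcs[w]][0] for w in ws}
            if len(ws) == 2 and kinds == {"send", "recv"}:
                # Internal channel: start when the later partner is ready.
                start = max(clocks[w] for w in ws)
            elif len(ws) == 1 and ch in env_ready:
                # External channel: the environment is the partner.
                start = max(clocks[ws[0]], env_ready.pop(ch))
            else:
                continue             # partner not ready yet
            comms[ch] = (start, start + kc)
            for w in ws:
                clocks[w] = start + kc
                pcs[w] += 1
            progress = True
    return clocks, comms


# Example 7.1: (c!; d!; delay 2) || (delay 6; d?), environment ready for c at 2.
p1 = [("send", "c"), ("send", "d"), ("delay", 2)]
p2 = [("delay", 6), ("recv", "d")]
clocks, comms = simulate([p1, p2], {"c": 2})
print(clocks, comms)
```

Under these assumptions the communication on $c$ occupies $[2, 3)$, the one on $d$ occupies $[6, 7)$, and the later of the two final clocks is 9, in agreement with the post-condition $time = 9$ obtained from $\max(t_1, t_2)$ in Rule 7.1.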

Similar  to  section  4,  to  achieve  a  sound  rule  for  parallel  composition,  an  inductive 
relation  between  assumption  and  commitment  is  necessary  to  avoid  circularity.  In 
our real-time specifications we require for the validity of $(A, C) : \{p\}\ S\ \{q\}$ that there exists a $\delta > 0$ such that

1. for all $t$ with $0 \le t < \delta$: $C$ holds at $t$, and

2. for all $t \ge \delta$: if $A$ holds at $t - \delta$ then $C$ holds at $t$.


Compositional  methods  for  concurrency 


In our examples this requirement is fulfilled if comm via $D$ during $[t_0, t_1)$ is not considered as an abbreviation but as a primitive which is trivially true at all points of time before $t_1$.

Other  rules  and  axioms  for  the  assumption/commitment  formalism  can  be  obtained 
by  adapting  the  commitment-based  proof  system  of  the  previous  section.  We  simply 
add  assumption  true  to  the  rule  and  axioms  for  atomic  statements,  and  the  proof 
system  is  extended  by  a  rule  that  allows  the  addition  of  an  assumption  to  strengthen 
commitment and post-condition. To formulate this rule, let $p@t$ denote assertion $p$ at time $t$, ignoring the part of $p$ that refers to points of time after $t$. (By definition, $(p@t) \equiv true$ if $t < 0$.) Then, for all $\delta > 0$, we have the following rule.

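Rule 7.2. (Strengthen)

$$\frac{\begin{array}{c}
(A_1, C_1) : \{p_1\}\ S\ \{q_1\} \\
\forall t : (A@(t - \delta)) \wedge (C_1@t) \rightarrow (C@t) \\
\forall t : (A@(t - \delta)) \wedge (q_1@t) \rightarrow (q@t)
\end{array}}{(A_1 \wedge A, C) : \{p_1\}\ S\ \{q\}}$$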

Furthermore,  the  rules  for  compound  statements  have  to  be  adapted,  and  in  the 
consequence  rule  we  require  that  all  implications  hold  point-wise. 

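Rule 7.3. (Consequence)

$$\frac{\begin{array}{c}
(A_1, C_1) : \{p_1\}\ S\ \{q_1\} \\
\forall t : (A@t) \rightarrow (A_1@t), \quad \forall t : (p@t) \wedge (time < \infty) \rightarrow (p_1@t) \\
\forall t : (C_1@t) \rightarrow (C@t), \quad \forall t : (q_1@t) \rightarrow (q@t)
\end{array}}{(A, C) : \{p\}\ S\ \{q\}}$$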


8.  Related  work  and  state  of  the  art 

For  concurrent  programs  communicating  via  message  passing  as  well  as  for  shared 
variable  concurrency,  one  can  observe  a  development  from  non-compositional  proof 
methods  which  require  the  (final)  program  text  for  their  application,  such  as  Owicki  & 
Gries  (1976),  Apt  et  al  (1980)  and  Levin  &  Gries  (1981),  towards  compositional  theories, 
e.g.  Chen  &  Hoare  (1981,  pp.  1-12),  Soundararajan  (1984b),  Stirling  (1986,  pp.  407-415), 
Zwiers  (1989)  and  Stolen  (1990)  (see  de  Roever  1985b,  pp.  181-207,  and  Hooman  et  al 
1986  for  an  overview  of  this  development).  An  early  Indian  pioneer  in  compositional 
proof  methods  for  concurrency  is  Soundararajan  (Soundararajan  1984;  Sobel  & 
Soundararajan  1985,  pp.  343-359).  Whereas  these  methods  verify  only  safety  properties, 
with  temporal  logic  (Pnueli  1977,  pp.  46-57;  Manna  &  Pnueli  1982,  pp.  163-255) 
also  liveness  (progress)  properties  can  be  verified.  Compositional  proof  systems  for 
temporal logic have been given in Barringer et al (1984, pp. 51-63) and Nguyen et al
(1986).  In  Pandya  &  Joseph  (1991)  a  compositional  proof  system  called  P-A  logic  (for 
presupposition-affirmation  logic)  is  described  for  establishing  weak  total  correctness 
and  weak  divergence  correctness  of  CSP-like  distributed  programs  with  synchronous 
and  asynchronous  communication.  This  extension  allows  compositional  deadlock 
proofs and, moreover, compositional proof rules are given for until-properties of the
form Q until R, where Q and R are assertions over communication traces. It seems
that this paper describes how far one can go towards proving liveness in a
compositional framework using a non-temporal formalism in an assumption-commitment
based setting.

Interestingly,  more  involved  programming  language  fragments  for  concurrency, 
such  as  the  concurrency  fragment  of  Ada  (1983)  or  those  for  monitor  based  languages, 
have  not  been  characterized  until  now  through  compositional  trace-based  methods, 
although  their  non-compositional  characterization  was  possible  and  has  been 
given  -  see  de  Roever  (1985a,  pp.  213-260)  for  an  overview.  However,  if  one  allows 
locations  inside  programs  as  observables,  there  exists  a  straightforward  technique  to 
convert  non-compositional  proof  methods  for  concurrency  to  compositional  ones,  as 
reported  in  Gerth  &  de  Roever  (1986).  Similarly,  Stirling  (1986,  pp.  407-416)  reports 
how  a  basically  non-compositional  proof  method  for  shared  variable  concurrency, 
such  as  the  one  by  Owicki  &  Gries  (1976),  can  be  reformulated  compositionally  within 
a  framework  based  on  relevance  logic.  Also  for  the  parallel  object  oriented  language 
POOL  first  non-compositional  proof  methods  (America  &  de  Boer  1990)  have  been 
developed  based  on  the  method  of  Apt  et  al  (1980).  The  principal  author,  de  Boer, 
recently  reformulated  his  proof  system  along  history-based  compositional  lines  (de 
Boer  1991)  using  the  work  of  Zwiers  (1989)  as  a  starting  point. 

In  the  present  paper  we  discuss  a  compositional  proof  system  for  distributed  message 
passing  in  which  assumptions  can  be  made  about  the  behaviour  of  the  environment 
in  the  style  of  Misra  &  Chandy  (1981)  and  Zwiers  et  al  (1984).  The  main  idea  of  the 
method is that suitable assumptions about the environment reduce the immense
number  of  possible  behaviours  of  complex  reactive  systems.  Misra  &  Chandy  (1981) 
were  the  first  ones  to  demonstrate  the  advantages  of  assumptions  in  the  hierarchical 
design  and  verification  of  distributed  processes  with  message  passing.  They  proposed 
a  compositional  rule  for  the  parallel  operator  and  demonstrated  their  method  on 
several  examples.  These  ideas  have  been  formalized  by  Zwiers  et  al  (1984),  resulting 
in  a  compositional  proof  system  for  assumption/commitment  based  specifications 
together  with  its  soundness  and  completeness  proof.  The  examples  in  Ossefort  (1983) 
show  that  the  Misra-Chandy  method  is  easy  to  use  indeed  and  that  it  leads  to 
simple  and  natural  correctness  proofs.  In  Pandya  (1988)  and  Pandya  &  Joseph  (1991), 
the  formalism  of  Zwiers  et  al  (1984)  is  extended  to  asynchronous  communication  and 
progress  properties.  Also  related  is  the  formalism  given  by  Stark  (1985,  pp.  369-391), 
who  uses  rely-  and  guarantee-conditions  for  deriving  global  liveness  properties  of  a 
distributed  system.  Interestingly,  at  present  new  non-standard  applications  of  the 
assumption-commitment framework are burgeoning, e.g., for characterizing specifications
of fault-tolerant processes or within development methods for mutual exclusion
algorithms  in  which  the  final  algorithm  is  obtained  by  a  series  of  error  containing 
approximations  in  which  errors  are  gradually  removed  until  all  are  absent,  see  e.g. 
Cau  &  Kuiper  (1991).  A  compositional  theory  for  action  refinement  is  developed  by 
Janssen et al (1991), in a setting of partial orders, and this theory is applied to proving
the  correctness  of  distributed  databases  by  formalizing  the  notion  of  serializability. 
It  contains  an  example  of  how  a  general  specification  using  an  unbounded  number 
of  processors  is  refined  to  an  implementation  using  two  processors.  Recently,  a 
number  of  developments  in  assumption/commitment  based  reasoning  have  taken 
place;  for  the  state  of  the  art  consult  the  papers  by  Pandya  (1989,  pp.  622-640)  and 
Abadi  &  Lamport  (1989,  pp.  1-41). 

The present paper focusses on concurrent processes with synchronous message
passing along channels. What about compositional approaches to shared-variable
concurrency and asynchronous message passing? As to shared-variable concurrency,
a basic observation is made by Aczel (as reported in de Roever 1985b, pp. 181-207),
in which in the semantics of a process a distinction is made between a so-called
“component action” Π (an action of the process itself) and an “environment action”
E (an update to a shared variable by another process). Then (x, Π) and (x, E) are the
analogues of the communication records in the channel-based theory above. The
original reference in which these notions were informally introduced is Jones (1981);
Jones  (1983)  is  a  more  accessible  reference  to  these  ideas  and  contains  a  proposal  for 
assumption/commitment  based  reasoning  (called  rely/guarantee  reasoning)  about 
shared  variable  concurrency  together  with  the  formulation  of  a  compositional  proof 
rule  for  the  parallel  operator.  The  idea  is  exemplified  in  Woodcock  &  Dickinson 
(1988)  and  formally  worked  out  in  Stolen  (1990).  In  a  mixed  (temporal  logic)-(transition 
system) based approach, asynchronous communication is characterized compositionally
in the work by Jonsson (1987a, 1987b, pp. 152-166). It is shown (de Boer
et  al  1990)  that  for  a  compositional  description  of  any  programming  language  based 
upon  asynchronous  communication  a  trace  model  is  sufficient,  i.e.,  no  additional 
structures  to  encode  some  relevant  branching  information  (trees,  failure  sets)  are 
needed.  A  compositional  axiomatization  is  given  (Hooman  et  al  1990,  pp.  242-261, 
1992)  for  the  graphical  specification  language  Statecharts  which  includes  features  like 
concurrency,  broadcast  communication,  and  time-out. 

A dichotomy is observed (Zwiers & de Roever 1989) in compositional proof theories
for  concurrency;  one  class  of  methods  (including,  e.g.,  temporal  logic  and  VDM),  is 
based  on  programs  as  predicates,  and  has  a  simple  proof  theory  (due  to  the  power 
of  the  consequence  rule),  but  has  trouble  in  characterizing  sequential  composition 
and  iteration.  Methods  in  the  other  class  (including  weakest  pre-condition  calculi, 
Hoare  triples  and  dynamic  logics),  are  based  on  programs  as  predicate  transformers, 
and  have  no  trouble  in  dealing  with  sequential  composition  and  iteration,  but  are 
more  complicated  due  to  awkward  implication  rules.  An  attempt  at  unification  is 
made  (Zwiers  &  de  Roever  1989)  by  using  adjoints. 

Except  for  Statecharts,  these  methods  are  not  designed  to  verify  and  specify  real-time 
properties.  Now  an  obvious  approach  towards  a  verification  theory  for  real-time 
programs  is  to  adapt  and  extend  an  already  existing  method  which  does  not 
incorporate  any  notion  of  time.  For  instance,  in  traditional  linear  temporal  logic, 
safety  and  liveness  properties  are  expressed  by  means  of  a  qualitative  notion  of  time 
(e.g.  “eventually”,  “henceforth”,  “until”).  In  order  to  express  real-time  constraints, 
extensions  of  this  logic  have  been  proposed  (Bernstein  &  Harter  1981,  pp.  1-11; 
Shankar & Lam 1987; Koymans 1989) which also include a quantitative notion of
time  (e.g.  “eventually  within  5  time  units”,  “always  after  7  time  units”).  These  extensions 
have  been  applied  to  the  specification  of  real-time  communication  properties  of  a 
transmission  medium  (Koymans  et  al  1983,  pp.  187-197)  and  the  verification  of  local 
area  network  protocols  (Shasha  1984,  pp.  54-65).  A  compositional  proof  theory  for 
real-time  distributed  message  passing  using  an  assertion  language  based  on  real-time 
temporal  logic  has  been  given  in  Hooman  &  Widom  (1989,  pp.  424-441).  In  Hooman 
(1991b)  this  compositional  method  is  extended  to  uniprocessor  implementations  and 
priorities.  Non-compositional  proof  methods,  based  on  Manna  &  Pnueli’s  (1982,  pp. 
163-255)  classical  approach  to  linear  time  temporal  logic,  can  be  found  in  Harel 
(1988) and Ostroff (1989). They express real-time properties in explicit clock temporal
logic  and  give  decision  procedures  for  this  logic. 

Similarly, real-time extensions have been formulated for other methods. There is
an early paper of Haase (1981) in which time is introduced by a special variable in
the  weakest  pre-condition  calculus.  Bernstein  (1987)  discusses  several  ways  of 
modelling  message  passing  with  time-out  in  the  non-compositional  framework  of 
Levin  &  Gries  (1981).  Zwarico  &  Lee  (1985,  pp.  169-177)  have  adapted  Hoare’s  (1985) 
trace  model  (with  one  invariant  and  a  satisfaction  relation)  to  real-time.  Nested 
parallelism  is  not  allowed  in  their  programming  language,  a  restricted  version  of 
sequential  composition  is  used,  and  there  is  no  explicit  mechanism  for  expressing 
time  constraints.  A  real-time  logic  to  analyze  safety  properties  is  defined  (Jahanian  & 
Mok  1986)  based  on  a  function  which  assigns  a  time  value  to  each  occurrence  of  an 
event.  Real-time  properties  of  sliding  window  protocols  are  verified  by  Shankar  & 
Lam  (1987)  using  special  state  variables,  called  timers,  to  measure  the  passage  of  time. 
The compositional proof system from Davies & Schneider (1989, pp. 129-159) and
Schneider  (1990)  for  timed  CSP  supports  semantic  reasoning  in  the  framework  of 
Reed  &  Roscoe  (1986,  pp.  314-323).  Furthermore,  Schneider  (1990)  defines  a  notion 
of  time-wise  refinement  to  transfer  properties  of  non-timed  CSP  programs  to  their 
timed  version,  thus  exploiting  the  hierarchy  of  timed  and  untimed  models  from  Reed 
(1989, pp. 80-128). Baeten & Bergstra (1990) have incorporated real-time aspects in
the process algebra of Bergstra & Klop (1984) by adding time stamps to atomic
actions.  In  their  approach  atomic  actions  have  a  positive  duration,  whereas  in  the 
process  algebra  of  Nicollin  et  al  (1990,  pp.  402-429)  actions  have  no  duration  in 
general,  except  a  distinguished  time  action  which  models  the  ticks  of  a  synchronized 
global  clock.  Furthermore,  Nicollin  et  al  (1990,  pp.  402-429),  present  a  systematic 
approach  to  delay-constructs.  Milner’s  (1989)  CCS  is  extended  by  Moller  &  Tofts 
(1990,  pp.  401-415)  and  Yi  (1990,  pp.  502-520)  with  explicit  time.  To  obtain  a  calculus 
for  shared  resources,  in  Gerber  &  Lee  (1990,  pp.  263-277)  a  priority-based  process 
algebra  is  presented. 


This  work  was  partially  supported  by  Esprit-BRA  project  3096:  Formal  Methods  and 
Tools for the Development of Distributed and Real-Time Systems (SPEC).

Abstract. In this paper, we develop a compositional denotational semantics
for  prioritized  real-time  distributed  programming  languages.  One  of  the 
interesting  features  is  that  it  extends  the  existing  compositional  theory 
proposed  by  Koymans  et  al  (1988)  for  prioritized  real-time  languages 
preserving  the  compositionality  of  the  semantics.  The  language  permits 
users  to  define  situations  in  which  an  action  has  priority  over  another 
action  without  the  requirement  of  preassigning  priorities  to  actions  for 
partially ordering the alphabet of actions. Such features are part of
languages such as Ada, which was designed specifically with the needs of
real-time embedded systems in view. Further, the approach does not have the
restrictions of other approaches, such as the restriction that only prioritized
internal moves can pre-empt unprioritized actions. Our notion of priority in the environment
is  based  on  the  intuition  that  a  low  priority  action  can  proceed  only  if  the 
high  priority  action  cannot  proceed  due  to  lack  of  the  handshaking  partner 
at  that  point  of  execution.  In  other  words,  if  some  action  is  possible 
corresponding  to  that  environment  at  some  point  of  execution  then  the 
action  takes  place  without  unnecessary  waiting.  The  proposed  semantic 
theory  provides  a  clear  distinction  between  the  semantic  model  and  the 
execution  model  -  this  has  enabled  us  to  fully  ensure  that  there  is  no 
unnecessary  waiting. 

Keywords.  Compositional  specification;  real-time  distributed  systems; 
priority  specification;  message  passing  models. 


1.  Introduction 

Many  approaches  have  been  proposed  for  the  modelling  of  communicating  agents 
(Milner  1980),  reactive  systems  (Pnueli  &  Harel  1988,  pp.  84-98),  and  real-time 
distributed  systems  (Roscoe  1984;  Koymans  et  al  1985,  1988).  Most  of  the  above 
studies  have  ignored  the  notion  of  priority.  This  is  not  satisfactory  as  priority  is  very 
important  in  the  development  of  predictable  systems  (Stankovic  1988).  Priority 
specification is required when two or more events happen at the same time and some
events have greater importance than others. Typical examples of actions which require
special treatment include interrupts in hardware systems and timeouts in communication
protocols. Ada and Occam are two examples of programming languages that
allow specification of priority. One can broadly classify priority specification into
the following categories:

(1)  Partial  order  on  the  alphabet  of  actions/events. 

(2) Priority specification through evaluating expressions dynamically and assigning
priorities  to  actions/statements. 

(3)  Priority  of  actions  depending  on  the  environment. 

Perhaps  the  first  formal  study  of  priorities  has  been  done  in  the  context  of  process 
algebra  in  Baeten  et  al  (1985).  In  this  study,  prioritization  operators  have  been  added 
and the consistency of the set of equations has been studied. In other words, the study
falls  into  type  (1)  described  above.  That  is,  it  assumes  a  global  partial  ordering  of  the 
actions  (of  the  transition  system).  From  a  behavioural  equivalence  point  of  view,  a 
study  has  been  made  in  Cleaveland  &  Hennessy  (1990)  through  the  study  of  priority 
operators  of  type  (1)  in  the  context  of  calculus  of  communicating  systems  (CCS). 
Congruence  has  been  obtained  through  the  notion  of  patient  processes  by  placing 
essentially  the  restriction  that  only  prioritized  internal  actions  have  the  pre-emptive 
power.  The  desirability  of  overcoming  this  restriction  follows  from  the  examples 
discussed  in  Cleaveland  &  Hennessy  (1990). 

A  study  of  the  second  type  of  priority  has  been  made  in  Hooman  (1989)  in  the 
context  of  the  study  of  distributed  multi-processing.  Here,  priorities  are  attached  to 
statements  and  the  priorities  of  statements  on  different  processors  are  incomparable. 
In  other  words,  the  relative  ordering  of  priorities  on  a  single  processor  determines 
the  execution  order;  for  example,  if  two  synchronized  actions  have  different  priorities, 
then  the  priority  for  the  synchronized  action  is  given  by  the  minimum  of  the  two. 
Another  way  of  overcoming  this  problem  is  to  use  one-way  priority,  that  is,  either 
the  input  or  output  action  can  be  assigned  priority  but  not  both. 

Let  us  look  at  the  specification  of  reactive  systems.  Reactive  systems  maintain  a 
continuous  interaction  with  their  environment  at  a  speed  determined  by  the 
environment  rather  than  the  program  itself.  In  other  words,  the  outputs  may  affect 
future  inputs  due  to  feedback.  One  of  the  primary  goals  of  the  study  of  reactive 
(real-time)  systems  is  to  develop  predictable  systems  (cf.  Stankovic  1988).  Thus,  it 
becomes essential to predict the action of each component in the context of a
nondeterministic interacting environment. For this purpose, priority plays a vital role.
Hence,  priority  of  type  (3)  discussed  above  plays  an  important  role  in  distributed 
reactive  systems.  Let  us  look  at  the  priority  specification  characteristics  in  Ada  which 
has  been  designed  keeping  in  view  the  design  of  real-time  embedded  systems.  In  Ada 
each  task  may  (but  need  not)  have  a  priority.  We  quote  below  relevant  aspects  from 
the  revised  Ada  manual  (cf.  9-8)  given  in  Gehani  (1983): 

•  The  specification  of  priority  is  an  indication  given  to  assist  the  implementation  in 
the  allocation  of  processing  resources  to  parallel  tasks  when  there  are  more  tasks 
eligible  for  execution  than  can  be  supported  simultaneously  by  the  available 
processing  resources.  The  effect  of  priorities  on  scheduling  is  defined  by  the  following 
rule: 

If two tasks with different priorities are both eligible for execution and could sensibly
be executed using the same physical processors and the same other processing
resources, then it cannot be the case that the task with the lower priority is executing
while the task with the higher priority is not.

•  The  above  rule  essentially  corresponds  to  the  pragma  feature.  The  most  important 
aspect  related  to  priority  specification  in  the  context  of  a  rendezvous  is  cited 
below: 

For  tasks  of  the  same  priority,  the  scheduling  is  not  defined  in  the  language. ...  The 
priority  of  the  task  is  static  and  therefore  fixed.  However,  the  priority  during  a 
rendezvous  is  not  necessarily  static  since  it  also  depends  on  the  priority  of  the  task 
calling  the  entry. 

It  can  be  seen  from  the  above  that  priority  specification  in  Ada  corresponds  to  that 
of  type  (3)  described  above.  In  appendix  A,  we  illustrate  through  an  example  priority 
specification  in  Ada  by  considering  the  problem  of  minimizing  head  movement  during 
a  disc  access.  Priority  specifications  corresponding  to  type  (3)  given  above  play  an 
important  role  in  the  specification  of  process-control  systems. 

A  limited  treatment  of  priority  of  type  (3)  has  been  reported  in  Pitassi  et  al  (1986). 
The  study  in  Pitassi  et  al  (1986)  corresponds  to  a  special  case  of  the  prioritized 
alternative  construct  of  Occam  (cf.  Occam  1984).  The  limitation  can  be  understood 
by  the  informal  interpretation  of  the  alternative  construct  given  below:  Upon  entering 
a  prioritized  alternate  (ALT)  statement,  a  linear  sequence  of  all  the  open  guards  of 
the  prioritized  statement  is  constructed;  if  one  or  more  of  the  open  guards  is  successful 
upon  entry,  then  the  first  successful  guard  in  that  sequence  is  selected  and  the 
corresponding  statement  is  executed.  If  none  of  the  open  guards  is  successful  upon 
entry,  then  the  prioritized  ALT  construct  is  treated  as  if  it  were  an  ordinary  ALT 
construct. 

So  far  in  the  literature,  there  is  no  indication  as  to  what  approach  would  be  realistic 
and  useful.  In  this  paper,  we  provide  a  formal  semantics  for  priority  in  the  context 
of  real-time  distributed  programming  languages  that  have  the  feature  of  specifying 
priority  in  an  environment  rather  than  providing  a  global  partial  order  of  actions  in 
the  system.  The  main  contributions  of  the  paper  can  be  summarized  as  follows. 

•  An  understanding  of  the  notion  of  priority  in  the  context  of  environment  (i.e., 
real-time  distributed  concurrency)  is  provided.  Our  approach  does  not  have  the 
restriction  (as  in  Cleaveland  &  Hennessy  1990)  that  only  prioritized  internal  moves 
can  pre-empt  unprioritized  actions.  Our  notion  of  priority  in  the  environment  is 
based  on  the  intuition  that  a  low  priority  action  can  proceed  only  if  the  high 
priority  action  cannot  proceed  due  to  lack  of  the  handshaking  partner  at  that  point 
of  execution.  In  other  words,  if  some  action  is  possible  corresponding  to  that 
environment  at  some  point  of  execution  then  the  action  takes  place  without 
unnecessarily  waiting.  It  must  be  clear  that  such  an  approach  clearly  satisfies  the 
requirement  given  in  Lamport  (1985)  that  realistic  priority  specification  should  not 
involve  unnecessary  waiting. 

•  A  compositional  semantic  characterization  of  real-time  distributed  languages  with 
priority  is  presented  and,  thus,  forms  a  basis  for  the  compositional  proof  theories 
of  languages  such  as  Ada  and  Occam. 

The  rest  of  the  paper  is  organized  as  follows:  Section  2  describes  an  abstraction  of 
prioritized  real-time  distributed  concurrency  in  terms  of  an  extension  of  real-time 
communicating sequential processes (CSP-R) (Koymans et al 1988) and is referred to
as CSP-Rp; further, the language syntax and informal interpretation of CSP-Rp are
discussed. Section 3 describes the real-time model and the execution model, and §4
describes the semantic domain and semantic equations. Section 5 discusses parallel
composition both informally and formally; towards the end of §5, we discuss the
impact  of  various  parameters  including  those  that  affect  maximal  parallelism  on  the 
prioritized  semantics  proposed.  The  paper  concludes  with  a  discussion  of  the  results 
and  the  ongoing  work. 


2.  Language  syntax 

For  ease  of  presentation,  we  use  language  CSP-Rp  (prioritized  real-time  CSP)  instead 
of  Ada.  It  may  be  noted  that  CSP-Rp  is  an  extension  of  CSP-R  described  in  Koymans 
et  al  (1988).  In  Koymans  et  al  (1988),  it  has  been  shown  that  Ada  tasking  and  real-time 
features  can  be  simulated  by  CSP-R.  Further,  CSP-Rp  has  the  priority  features 
corresponding  to  that  of  type  (3)  discussed  above.  In  CSP-Rp  processes  communicate 
via  unidirectional  channels,  and  a  channel  connects  exactly  two  processes. 

D, W - stand for channel variables;
x, u - stand for program variables;
e - stands for expressions;
b - stands for boolean expressions;
p - stands for priority expressions.

Program: P ::= S | N

Statement: S ::= x := e | skip | g | wait d | S1; S2 | A | *A | [N]

Guard: g ::= b | D?x | W!e | wait d | b; D?x by p | b; W!e by p | b; wait d

Alternative: A ::= [ []_{i=1}^{n} g_i -> S_i ]

Network: N ::= S1 || S2

The  informal  semantics  of  the  language  follows  on  the  lines  of  the  semantics  of 
CSP-R  given  in  Koymans  et  al  (1988)  and  Huizing  et  al  (1987).  It  may  be  observed 
that  in  our  language,  priority  can  be  assigned  to  only  I/O  guards  and  not  for  pure 
boolean  or  delay  guards  since  the  local  priorities  can  be  manipulated  through  the 
boolean  expressions.  Further,  we  assume  that  pure  boolean  guards  have  priority  over 
the  I/O  guards.  Again,  note  that  the  priority  for  local  action/communication  can 
always  be  manipulated  through  the  boolean  parts  of  the  expressions.  For  ease  of 
understanding,  we  provide  an  informal  interpretation  of  those  commands  that  are 
quite  different  from  those  of  CSP  described  in  Hoare  (1978). 

Interpretation  of  alternative  command:  if  none  of  the  guards  is  open,  then  execution 
aborts;  otherwise,  check  whether  there  is  at  least  one  open  pure  boolean  guard  and 
select  one  of  them  nondeterministically.  In  case  there  is  no  open  boolean  guard  but 
there  is  at  least  one  other  open  guard  the  execution  proceeds  in  the  following  way: 
Compute  the  waitvalue  which  is  defined  to  be  infinite  if  there  are  no  open  wait  guards; 
otherwise,  it  is  defined  to  be  the  maximum  of  1  and  the  minimum  of  the  values  of  the 
durations of the open wait guards. Let us denote the waitvalue by d. As soon as there
is a possible semantic match for an open I/O command, the communication action
takes place over the channel that has the highest priority (the selection among
semantically matching commands of equal priority is nondeterministic). However, if
no semantic match takes place within d units, then one of the open wait guards with
waitvalue equal to d is selected nondeterministically. It may be observed that we have
assumed a priority (see footnote 1) for the boolean guards over the I/O guards. Note that the I/O
guards can be assigned any positive priority; we assume a default priority of 1 in case
there  is  no  explicit  specification.  No  explicit  priority  can  be  assigned  to  delay  guards; 
in  other  words,  the  delay  guards  are  assumed  to  have  priority  0.  This  is  consistent 
with  the  semantics  of  Ada. 
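
The selection rule above can be sketched in Python. This is only an illustrative sketch under stated assumptions, not part of CSP-Rp: the function names, the dictionary encodings of guards, and the `ready_at` map (first tick, relative to entry, at which a partner is ready on a channel, after which it is assumed to stay ready) are all hypothetical. "Within d units" is read here as "at a tick strictly before d".

```python
import math
import random

def waitvalue(open_wait_durations):
    """Infinite if there are no open wait guards; otherwise max(1, min duration)."""
    if not open_wait_durations:
        return math.inf
    return max(1, min(open_wait_durations))

def select_guard(open_bool, open_io, open_waits, ready_at, rng=random):
    """Pick the guard taken by one execution of an alternative command.

    open_bool : list of labels of open pure boolean guards
    open_io   : dict channel -> positive priority (open I/O guards)
    open_waits: dict label -> duration (open wait guards)
    ready_at  : dict channel -> first tick at which a partner is ready (assumption)
    Returns (kind, label, tick).
    """
    if not (open_bool or open_io or open_waits):
        raise RuntimeError("no open guard: execution aborts")
    if open_bool:
        # Pure boolean guards take precedence over I/O guards.
        return ("bool", rng.choice(sorted(open_bool)), 0)
    d = waitvalue(list(open_waits.values()))
    # Earliest tick before d at which some open channel has a semantic match.
    ticks = [t for ch, t in ready_at.items() if ch in open_io and t < d]
    if ticks:
        t = min(ticks)
        matched = [ch for ch in open_io if ready_at.get(ch, math.inf) <= t]
        top = max(open_io[ch] for ch in matched)
        best = sorted(ch for ch in matched if open_io[ch] == top)
        return ("io", rng.choice(best), t)  # equal priorities: nondeterministic
    if d == math.inf:
        return ("blocked", None, math.inf)  # no match and no wait guard
    expiring = sorted(l for l, v in open_waits.items() if max(1, v) == d)
    return ("wait", rng.choice(expiring), d)
```

For instance, with channels D (priority 1) and W (priority 3), the lower-priority D is still taken at tick 0 if W has no partner until tick 2, which is the "no unnecessary waiting" reading of priority.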

For example, consider a system consisting of three robots: P1, P2, and P3. Robots
P1 and P2 compete for some resource service from P3 such that P1 or P2 has to
wait only linearly (i.e. one can wait till the other gets the service) for service. The
controller for P3 can be abstracted through the following control program in CSP-Rp:

P3 :: i := 2; *[D?x by i -> i := 1; serve-P1 [] W?x by 3-i -> i := 2; serve-P2].

Informally,  in  the  program  the  priority  dynamically  switches  from  communication 
over  D  to  that  of  communication  over  channel  W.  It  must  be  noted  that  this  does 
not  mean  that  the  communications  over  D  and  W  alternate.  What  this  means  is  that 
if  this  program  is  placed  in  an  environment  wherein  the  communications  over  both 
D  and  W  are  available,  then  the  net  result  will  be  that  of  alternating  communications 
over  D  and  W.  However,  in  the  context  of  environments  wherein  the  communications 
over  D  and  W  are  not  always  ready,  the  behaviour  differs.  In  other  words,  one  of 
the important points to be noted is that programs with a priority clause cannot, in general,
be transformed into a program that does not use any priority clause unless the
environment is given a priori, directly or indirectly. Note that if several requests having
the same priority arrive at the same time, then the choice is nondeterministic. To be
more  specific,  we  consider  the  possible  set  of  actions  in  the  following  example. 

P1 ::= [D1?x by 1 -> S1
[] D2?y by 2 -> S2
[] D3?z by 3 -> S3]

Let t0 be the time of arrival at this select statement. Let us assume that some
communication takes place over some channel at time t1 (t1 ≥ t0). If we assume that
there is no unnecessary waiting, then the general condition is that t1 was the earliest
time at which the communication could take place. The possible interpretations are
given by:

1. Communication over D3 takes place at t1; this does not require any other condition
as D3 is the channel with the highest priority.

2. Communication over D2 takes place at t1; this implies that communication over
D3 was impossible since t0.

3. Communication over D1 takes place at t1; this implies that communications over
D2 and D3 were impossible since t0.

It must however be noted that how the possibility of a communication is determined
is an implementation issue (i.e. implementation of the synchronous communication)
and is not explicit in CSP-Rp. Note, however, that in Ada it is possible to determine such
a possibility through the entry queues.
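
The three interpretations can be checked mechanically. The following Python sketch is an illustration only: the function name and the `possible` map (for each channel, the set of ticks at which a partner makes communication possible) are hypothetical, since CSP-Rp leaves this determination to the implementation. It accepts an observation "communication over `chosen` at t1" only if every higher-priority channel was impossible throughout t0..t1 and no channel of equal or lower priority was possible before t1.

```python
def consistent_observation(priority, possible, chosen, t0, t1):
    """Check one observed communication against priority and no-unnecessary-waiting.

    priority: dict channel -> int (larger means higher priority)
    possible: dict channel -> set of ticks at which communication is possible
              (hypothetical encoding of the environment)
    """
    # The chosen communication must itself be possible at t1.
    if t1 not in possible.get(chosen, set()):
        return False
    for ch, p in priority.items():
        ticks = possible.get(ch, set())
        if p > priority[chosen]:
            # A higher-priority channel must have been impossible since t0.
            if any(t0 <= t <= t1 for t in ticks):
                return False
        else:
            # t1 must be the earliest time at which any communication could occur.
            if any(t0 <= t < t1 for t in ticks):
                return False
    return True
```

With priorities D1 = 1, D2 = 2, D3 = 3, a communication over D1 at t1 = 4 is accepted only when D2 and D3 have no partner anywhere in the interval from t0 to 4.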


1  One  of  the  reasons  for  this  assumption  is  that  in  our  semantics  for  the  sake  of  simplicity 
we  have  assumed  that  expression  evaluation  takes  no  time.  This  restriction  can  be  removed 
very  easily.  In  fact,  no  generality  will  be  lost  with  such  an  assumption. 
Hiding, [N], has no effect on the execution of N but changes what can be observed
about  such  executions.  In  other  words,  communications  along  channels  in  N  are 
internalized  and  cannot  be  observed  any  more. 


3.  Model  of  real-time  and  execution  model 

The  language  CSP-Rp  is  a  synchronous  language.  We  assume  that  time  proceeds  in 
discrete  time  steps.  This  is  consistent  with  the  argument  given  in  Pnueli  &  Harel 
(1988,  pp.  84-98)  that  integer  time  domain  will  be  appropriate  for  synchronous 
programming  languages,  since  all  the  processes  refer  to  the  same  global  clock  and 
operate  only  at  certain  points  of  time.  Thus,  our  time  domain  is  the  set  of  natural 
numbers.  For  the  sake  of  simplicity,  we  further  assume  that  all  primitive  actions  (such 
as  assignment  and  communication)  take  one  unit  of  time.  Parameterization  with 
respect  to  transmission  time  of  the  network  and  range  of  values  for  actions  (Koymans 
et  al  1988)  are  ignored  for  the  sake  of  simplicity.  In  other  words,  real-time  is  modelled 
by  relating  the  ith  element  of  a  history  with  the  ith  tick  of  a  conceptual  global  clock. 
However, it should be noted that we neither imply the existence of a global
clock nor assume tight synchronization of the processor clocks.

A  real-time  execution  model  is  useful  only  if  we  can  make  some  assumptions  about 
the  progress  each  process  is  due  to  make.  In  this  paper,  we  use  the  maximal  parallelism 
model  described  in  Koymans  et  al  (1988)  (we  use  MAXPAR  as  an  abbreviation).  In  the 
MAXPAR  model,  at  any  instant  of  time  all  actions  that  can  be  started  without  violating 
synchronization  constraints  will  be  initiated.  In  other  words,  we  assume  the  existence 
of  a  processor  for  every  process.  Such  an  assumption  removes  the  need  of  considering 
resource  scheduling.  That  is,  a  process  is  allowed  to  be  idle  only  if  all  communication 
partners  are  unwilling  to  communicate  and  no  local  actions  are  possible  at  that  point. 
Towards the end of the paper, we discuss other aspects of real-time models.


4.  Denotational  semantics  of  CSP-Rp 

Semantic  domain 

The domain consists of non-empty prefix-closed sets of pairs, each pair consisting of
a state and a finite history leading to this state. Infinite behaviours are modelled by
their  sets  of  finite  approximations.  In  order  to  enforce  maximal  parallelism,  we  have 
to  record  whether  the  processes  are  suspended,  and  if  so,  on  which  communication 
the  process  is  suspended  etc.  To  enforce  consistency  of  priority  the  semantics  has  to 
encode  the  priority  information  in  a  suitable  manner  so  that  the  semantics  remains 
compositional.  These  aspects  are  discussed  formally  in  the  following. 

The domain is $D_{dom} = \mathcal{P}(S \times H)$, where

• $S = \Sigma \cup \{\bot\}$, $\Sigma$ being the set of proper states (i.e. partial functions from Id (the set of
identifiers) to V (the set of expression values)), and

• H = set of sequences of records of the form <A, G>, where A is a set of communication
assumption records referred to as the Action set and G provides the partial order
information with reference to the action set necessary for checking priority
consistency.

The  communication  assumption  records  (CAR  for  short)  are  of  the  following  types. 

(1) Communication records of the form <D, v> where D is a channel name and $v \in VAL$
(domain of values). If the ith element is of the form <D, v>, it can be interpreted as
sending or receiving the value v over channel D at the ith tick of the conceptual
global clock.

(2) Ready records of the form R(A) where A is a subset of channel names. If the ith
element of a history is of the form R(A), it can be interpreted as the willingness
of the process to communicate over the channels in A at the ith tick of the
conceptual global clock, and the impossibility of communication since one of the
partners is not able to communicate.

(3)  Internal  moves  (denoted  □).  If  the  ith  element  is  □,  it  can  be  interpreted  as  a 
local  action  at  the  ith  tick  of  the  conceptual  global  clock. 

In other words, the observable actions are: (a) communication actions with the
associated priority, (b) the time of the observable actions, and (c) the state at the time
of termination.

In the a priori semantics, we keep information about the priority of the various
actions in terms of triples <H, D, L> with the following interpretation:

•  D  is  the  channel  over  which  communication  is  assumed  to  have  taken  place. 

•  H  is  the  set  of  channels  that  have  higher  priority  over  D. 

•  L  is  the  set  of  channels  having  priority  less  than  or  equal  to  that  of  D. 

Note. Informally, <H, D, L> has the following interpretation.

• Processes are ready to communicate over the channels in $H \cup \{D\} \cup L$.

• Communication takes place over D; that is, D is the channel that has a partner and there
is no other channel that has a priority higher than D having a ready partner; H
is the set of channels that have higher priority over that of D and D has a higher
priority over those of L.

Obviously,  sets  H,  {D},  and  L  are  mutually  disjoint. 

Before  describing  the  semantic  domain  and  the  equations,  we  formalise  the  priority 
triples. 

DEFINITION  1 

Consider the triple <H, D, L> where H and L are subsets of channel names and D is
a channel name. Then, the triple <H, D, L> denotes the relation
$\{(a, D) \mid a \in H\} \cup \{(D, b) \mid b \in L\}$.

Note. (1) By the underlying graph of <H, D, L>, we mean a directed graph (V, E)
where $V = H \cup \{D\} \cup L$ and E is the set of all directed edges (a, b) corresponding to
(a, b) in the underlying relation. We refer to these graphs as priority graphs.

(2) Priority (or precedence) graphs are said to be inconsistent if they are not acyclic.
We use $\bot$ to denote inconsistent priority graphs.
DEFINITION 2

Let G1 and G2 be priority graphs. Then,

$$join(G_1, G_2) = \begin{cases} \bot & \text{if } G_1 \cup G_2 \text{ is not acyclic,} \\ G_1 \cup G_2 & \text{otherwise.} \end{cases}$$
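
Definitions 1 and 2 translate directly into a few lines of Python. The sketch below is illustrative only: the function names, the use of `None` for the inconsistent graph, and the choice to treat an already inconsistent argument as absorbing are assumptions made for the example, not part of the definitions.

```python
BOTTOM = None  # stands for the inconsistent priority graph (assumed encoding)

def triple_relation(H, D, L):
    """Relation denoted by the priority triple <H, D, L> (Definition 1)."""
    return {(a, D) for a in H} | {(D, b) for b in L}

def is_acyclic(edges):
    """True if the directed graph given as a set of (a, b) edges has no cycle."""
    nodes = {n for e in edges for n in e}
    edges = set(edges)
    while nodes:
        # Nodes without incoming edges can be removed safely.
        sources = {n for n in nodes if all(b != n for _, b in edges)}
        if not sources:
            return False  # every remaining node lies on or behind a cycle
        nodes -= sources
        edges = {(a, b) for a, b in edges if a not in sources}
    return True

def join(g1, g2):
    """Union of two priority graphs, or BOTTOM if the union is cyclic (Definition 2)."""
    if g1 is BOTTOM or g2 is BOTTOM:
        return BOTTOM  # assumption: inconsistency is absorbing
    g = set(g1) | set(g2)
    return g if is_acyclic(g) else BOTTOM
```

For example, joining the graph of <{a}, D, {}> with that of <{}, D, {a}> yields the inconsistent graph, because the union contains both (a, D) and (D, a).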


In  the  following,  we  describe  formally  the  domain. 

Now, 

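$S \times H = \{\langle\sigma, h\rangle \mid \sigma \in S,\ h \in H \text{ and } |h| < \infty\}$.

A set $X \subseteq$ STATE × HISTORY is said to be prefix-closed iff for all $\langle\sigma, h\rangle \in X$, if $h' \le h$ then
$\langle\bot, h'\rangle \in X$.

The prefix-closure of X, denoted PFC(X), is defined as

$X \cup \{\langle\bot, \lambda\rangle\} \cup \{\langle\bot, h'\rangle \mid \exists\sigma\,\exists h\,(\langle\sigma, h\rangle \in X \wedge h' \le h)\}$,

where $\lambda$ denotes the empty history and $h' \le h$ means that h' is a prefix of h.

The domain consists of all nonempty prefix-closed elements of Ddom. Note that the
domain forms a complete lattice with set-inclusion ($\subseteq$); the lub is obtained by $\cup$
(set-union) and the least element is $\{\langle\bot, \lambda\rangle\}$.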
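
As a small illustration of the prefix-closure operation, the following Python sketch represents histories as tuples and the undefined state by the string "bottom". Both encodings, and the function names, are assumptions made for this example only.

```python
BOTTOM = "bottom"  # assumed encoding of the undefined state

def pfc(X):
    """Prefix-closure of a set X of (state, history) pairs; histories are tuples."""
    closed = set(X) | {(BOTTOM, ())}
    for _, h in X:
        # Every prefix of h (including h itself) is paired with the bottom state.
        for k in range(len(h) + 1):
            closed.add((BOTTOM, h[:k]))
    return closed

def is_prefix_closed(X):
    """True if every prefix of every history in X occurs paired with BOTTOM."""
    return all((BOTTOM, h[:k]) in X for _, h in X for k in range(len(h) + 1))
```

Applying `pfc` to a single pair whose history has two elements adds three pairs: the bottom state with the empty history, with the one-element prefix, and with the full history.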

The meaning function is of the form $M[\text{Statement}]: S \to D_{dom}$, defining the meaning
of statements from S to Ddom. For defining the meaning of the alternative command
compositionally, we define an auxiliary function G[g, A] from S to Ddom which gives
the meaning of the guard g in the context of a set A of alternative guards. Denoting
the set of alternative guards by A, we get:

$G[g, A]: S \to D_{dom}$.


Notation. (1) Let $h = \langle A_1, G_1\rangle \circ \langle A_2, G_2\rangle \circ \cdots \circ \langle A_n, G_n\rangle$ be a finite history of length
n. Then, we use $h^1 = A_1 \circ A_2 \circ \cdots \circ A_n$ to denote the projection of the history to the
first component while we use $h^2 = G_1 \circ G_2 \circ \cdots \circ G_n$ to denote the projection of the
history to the second component; the length of h is denoted by |h|, and the kth element
of h is denoted by h[k].

(2) The length of the history (trace) denotes the time taken for arriving at the point;
the empty set is denoted by $\phi$. Note that $R(\phi)$ also denotes □ as well as $\langle\phi, v\rangle$ in
our notation.

(3) For convenience, we use $\langle\sigma, A\rangle$ to denote $\langle\sigma, \langle A, G\rangle\rangle$ when G is an empty graph.

(4) If B is a set of histories, then we use $\langle\sigma, B\rangle$ to denote $\{\langle\sigma, h\rangle \mid h \in B\}$.

(5) We represent singleton action sets of the form {a}, where a is some communication
assumption record (i.e., <D, v>, R(A), or □), by a itself; we also omit the set symbol
for ready records. For example, {□}{□, R({D, W})}{<D, v>} is denoted by
□ {□, R(D, W)} <D, v>. We omit the concatenation operator ($\circ$) whenever it is clear
from the context.

(6)  Whenever  it  is  clear,  we  do  not  enclose  the  elements  of  the  trace  within  the  angular 
brackets. 

The  semantic  equations  for  the  language  constructs  are  formally  defined  in  appendix 
B;  in  the  following,  we  informally  discuss  features  of  the  parallel  composition. 



5.  Parallel  composition 

Informal  approach 

The  semantics  is  based  on  the  real-time  semantics  discussed  in  Koymans  et  al  (1988). 
The  semantic  domain  consists  of  state-history  pairs.  The  history  is  nothing  but  the 
traces  of  Koymans  et  al  (1988)  enriched  with  the  priority  information.  In  the  following, 
we  informally  discuss  how  priority  consistency  is  ensured  in  the  parallel  composition. 
Our  semantics  has  two  stages. 

•  A  priori  semantics  -  Here,  we  consider  the  semantics  of  each  process  in  an  isolated 
way. 

•  Binding  of  the  processes  -  Here,  the  meaning  of  the  program  is  obtained  by 
considering  the  meaning  of  the  component  processes. 

While  composing  the  processes,  we  check  for  the  mutual  consistency  of  the  isolated 
assumptions  made  in  each  of  the  processes.  As  already  mentioned,  the  semantics  of 
each  process  is  in  the  domain  of  state-history  pairs.  For  example,  let  us  consider  two 
state-history pairs, <s1, h1> and <s2, h2>, in processes P1 and P2, respectively. The
binding  of  the  states  should  be  understood  easily  as  processes  do  not  share  any 
common  variables.  Let  us  look  at  the  merging  of  the  histories.  The  two  histories  are 
said  to  be  consistent  iff: 

(1) Communication compatible - That is, for every communication assumption over
some channel, say D, in some process P1 there is a corresponding matching
communication partner in some process P2 at the same time.

(2) MAXPAR consistent (no unnecessary waiting) - Check that there is no unnecessary
waiting,  that  is,  histories  do  not  indicate  a  situation  where  both  the  processes  are 
waiting  for  a  communication  that  the  other  process  can  provide.  This  can  be 
verified  by  checking  that  there  is  no  common  ready-record  between  any  two 
histories  at  the  same  time. 

(3)  Priority  consistent  -  Check  that  the  histories  are  priority  consistent,  that  is, 
histories  do  not  indicate  a  situation  wherein  a  lower  priority  request  has  been 
accepted  in  spite  of  the  possibility  of  a  higher  priority  request. 

In the following, we informally show how priority consistency is ensured; communication
compatibility as well as MAXPAR consistency are essentially the same as in Koymans
et al (1988); the formal equations are given in appendix B.

For the understanding of priority consistency, let us consider the priority assumptions
<H, D, L> and <H', D', L'> in the histories of any two processes at some time
t. If the sets H, H', {D}, {D'}, L, L' are mutually disjoint then priority consistency
follows  trivially.  Further,  it  must  be  noted  that  if  the  two  CAR  under  consideration 
are  neither  communication  compatible  nor  MAXPAR  compatible  then  there  is  no  need 
of  a  separate  check  for  priority  consistency.  The  basic  idea  for  establishing  priority 
consistency  is  to  derive  the  underlying  graph  and  check  whether  there  are  inconsistent 
(circular)  precedences.  The  important  aspect  of  the  graph  construction  is  that  it  is 
done  incrementally  and  the  existing  graph  is  augmented  only  if  it  is  not  inconsistent 
after augmentation. In the following, we analyse the relations among these sets and
show  how  inconsistent  histories  can  be  removed;  a  separate  priority  check  is  resorted 
to  only  if  the  inconsistency  does  not  follow  either  from  communication  or  MAXPAR 
incompatibility.  For  the  sake  of  informal  reasoning,  we  do  a  case  analysis. 

The  important  cases  are: 

(1) $H \cap H' \neq \phi$

• In such a situation, clearly the two processes are waiting unnecessarily, since D has
a lower priority than those of the channels in $H \cap H'$ and D' has a lower priority
than those of the channels in $H \cap H'$. In other words, this is not MAXPAR consistent.
Thus, if we can ensure that the histories are MAXPAR consistent, then there is no
need for checking again for priority consistency.

(2) $D' \in H$

• The situation corresponds to the situation of no partner for communication
over D', that is, to communication incompatibility; hence, there is no need
for a separate check for priority consistency.

(3) $D' \in L$

• This case again can be ruled out on the same lines as case (2).

(4) $D \in H'$

• This case again can be ruled out on the same lines as case (2).

(5) $D \in L'$

• This case again can be ruled out on the same lines as case (2).

(6) $L \cap L' \neq \phi$

• There is no need for checking priority consistency since by definition, we assume
that communication does not take place over low priority channels.

(7) $H \cap L' \neq \phi \wedge D = D'$

• Consider the precedence graph shown in figure 1 (in the graph $\to$ denotes higher
priority than) for this case. It can be easily seen that the priorities are assigned
inversely in the two processes on at least one common channel; hence, inconsistent
(the inconsistency can be seen due to the cycle in the graph).

(8) $L \cap H' \neq \phi \wedge D = D'$

• Inconsistency in this case follows from the previous case.

(9) $H \cap L' \neq \phi \wedge H' \cap L = \phi$

• From $H \cap L' \neq \phi$ it follows that the priority of D' is greater than or equal to
that of the common channels of H and L'. By considering the second conjunct,
we can conclude that D must have a higher priority than that of D'. In other
words, there is a cycle and hence, inconsistent (see the priority graph shown in
figure 2).

(10) $L \cap H' \neq \phi \wedge H \cap L' = \phi$

• The consistency can be ensured in the same way as the previous case.
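
As a worked instance of case (7), take a hypothetical channel a (chosen purely for illustration) that one process ranks above D and the other ranks below D, with both processes assuming communication over the same channel D. Using the relation of Definition 1:

```latex
\[
\begin{aligned}
\langle H, D, L\rangle &= \langle \{a\}, D, \phi\rangle &&\text{denotes } \{(a, D)\},\\
\langle H', D', L'\rangle &= \langle \phi, D, \{a\}\rangle &&\text{denotes } \{(D, a)\},\\
G_1 \cup G_2 &= \{(a, D),\ (D, a)\} &&\text{contains the cycle } a \to D \to a.
\end{aligned}
\]
```

Here $H \cap L' = \{a\} \neq \phi$ and $D = D'$, so the combined priority graph is not acyclic and the pair of assumptions is rejected as inconsistent.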

Figure 1. Priority graph for case (7).

Figure 2. Priority graph for case (9).

In the above analysis, we have considered only the important situations; the other
situations can be considered in a similar way. The exact way of keeping track of the
information will be clear from the formal set of semantic equations. The equation for
parallel composition is given below:


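$$M[S_1 \| S_2]\sigma = PFC(\{\langle\sigma_1 \times \sigma_2,\ h_1 \# h_2\rangle \mid \forall i \in \{1,2\}: \langle\sigma_i, h_i\rangle \in M[S_i]\sigma \wedge consistent(h_1, h_2, cset)\}),$$

where

$$(\sigma_1 \times \sigma_2)(x) = \begin{cases} \sigma_1(x) & \text{if } \sigma_1, \sigma_2 \neq \bot \wedge x \in var(\sigma_1), \\ \sigma_2(x) & \text{if } \sigma_1, \sigma_2 \neq \bot \wedge x \notin var(\sigma_1) \wedge x \in var(\sigma_2), \\ \bot & \text{otherwise.} \end{cases}$$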

The pointwise merging of the histories $h_1 \# h_2$ and the predicate consistent are defined
below: Let cset be the set of common channels in S1 and S2. Then, the predicate
consistent is given by,

$consistent(h_1, h_2, cset) \triangleq Comm(h_1, h_2, cset) \wedge NW(h_1, h_2, cset) \wedge Pri(h_1, h_2, cset)$.


That is, h1 and h2 are said to be consistent iff they are communication compatible
(checked by predicate Comm), there is no unnecessary waiting (checked by predicate
NW) and they are priority consistent (checked by predicate Pri). Note that the consistency
is checked with reference to the joint channels. Each of these predicates is defined
below:

• $Comm(h_1, h_2, cset) \triangleq \forall\, 1 \le j \le \max(|h_1|, |h_2|),\ v \in Val,\ D \in cset: \langle D, v\rangle \in h_1^1[j] \text{ iff } \langle D, v\rangle \in h_2^1[j]$.

That is, for every communication on the joint channel, there is a reciprocal
communication in the other process at that point of time with respect to some
priority assumptions.

• $NW(h_1, h_2, cset) \triangleq \forall\, 1 \le j \le \min(|h_1|, |h_2|): R(A) \in h_1^1[j] \wedge R(B) \in h_2^1[j] \rightarrow A \cap B = \phi$.
That is, processes are not unnecessarily waiting for each other.

• $Pri(h_1, h_2, cset) \triangleq \forall\, 1 \le j \le \min(|h_1|, |h_2|),\ G_1 \in h_1^2[j] \wedge G_2 \in h_2^2[j]$:
$join(G_1, G_2)$ is an acyclic (precedence) graph.
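
The three predicates can be prototyped as follows. This Python sketch is illustrative and rests on assumed encodings that are not in the paper: a history is a list of (A, G) pairs, A is a set of records written as ("comm", D, v), ("ready", frozenset_of_channels) or ("tau",), and G is a set of (a, b) edges. A history that has already ended is treated as contributing no records.

```python
def has_cycle(edges):
    """True if the directed graph given as a set of (a, b) edges has a cycle."""
    nodes = {n for e in edges for n in e}
    edges = set(edges)
    while nodes:
        sources = {n for n in nodes if all(b != n for _, b in edges)}
        if not sources:
            return True
        nodes -= sources
        edges = {(a, b) for a, b in edges if a not in sources}
    return False

def actions(h, j):
    """Action set of history h at position j (empty once h has ended)."""
    return h[j][0] if j < len(h) else set()

def comm(h1, h2, cset):
    """Communications on joint channels must occur in both histories at the same tick."""
    for j in range(max(len(h1), len(h2))):
        c1 = {r for r in actions(h1, j) if r[0] == "comm" and r[1] in cset}
        c2 = {r for r in actions(h2, j) if r[0] == "comm" and r[1] in cset}
        if c1 != c2:
            return False
    return True

def nw(h1, h2):
    """No tick at which both processes are ready on a common channel."""
    for j in range(min(len(h1), len(h2))):
        for r1 in actions(h1, j):
            for r2 in actions(h2, j):
                if r1[0] == "ready" and r2[0] == "ready" and r1[1] & r2[1]:
                    return False
    return True

def pri(h1, h2):
    """The priority graphs recorded at each common tick must combine acyclically."""
    return all(not has_cycle(h1[j][1] | h2[j][1])
               for j in range(min(len(h1), len(h2))))

def consistent(h1, h2, cset):
    return comm(h1, h2, cset) and nw(h1, h2) and pri(h1, h2)
```

Two one-step histories that both record <D, 5> but order a hypothetical channel a on opposite sides of D pass `comm` and `nw` and fail only `pri`.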

Pointwise merging of histories: Let h1 and h2 be consistent with reference to the set of
joint channels cset. Then, $h_1 \# h_2$ is defined by the jth pointwise union as follows:

$(h_1 \# h_2)[j] = \langle A, G\rangle$, where:

(1) $A = \{\langle D, v\rangle \mid (j \le |h_1| \vee j \le |h_2|) \wedge (\langle D, v\rangle \in h_1^1[j] \vee \langle D, v\rangle \in h_2^1[j])\}$

• If the communication is over a channel not belonging to cset (the length of the
two traces may not be equal), then the record is kept without any change; otherwise,
both traces must contain the same communication record <D, v> at the time point
j (as required by Comm).

$\cup\ \{R(\{D \mid (j \le |h_1| \vee j \le |h_2|) \wedge (R(D) \in h_1^1[j] \vee R(D) \in h_2^1[j])\})\}$.

• Obtain the new ready set of channels by taking the union of all the channels over
which the processes are waiting; note that there is no need to check whether the
channel is in cset or not as the consistency test has already assured (by NW) that
there is no unnecessary waiting; internalizing the channel in cset is handled by
the hiding rule.

(2) $G = join(h_1^2[j], h_2^2[j])$.
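
A minimal Python sketch of the pointwise merge follows. It assumes the same hypothetical encoding as before (a history is a list of (A, G) pairs, with records ("comm", D, v) and ("ready", frozenset_of_channels)), assumes the two histories have already passed the consistency test so that the union of the graphs is acyclic, and represents an internal move as a ready record with the empty channel set, following notation (2).

```python
def merge(h1, h2):
    """Pointwise merge of two consistent histories (sketch, assumed encoding)."""
    merged = []
    for j in range(max(len(h1), len(h2))):
        a1, g1 = h1[j] if j < len(h1) else (set(), set())
        a2, g2 = h2[j] if j < len(h2) else (set(), set())
        records = a1 | a2
        # Communication records are kept as they are.
        comms = {r for r in records if r[0] == "comm"}
        # The new ready set is the union of all channels being waited on.
        waiting = set().union(*(r[1] for r in records if r[0] == "ready"))
        A = comms | {("ready", frozenset(waiting))}
        # Consistency (Pri) is assumed, so the union of the graphs is acyclic.
        G = set(g1) | set(g2)
        merged.append((A, G))
    return merged
```

The merged history is as long as the longer of the two inputs; positions beyond the end of the shorter history simply copy the records of the longer one.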

Before defining the equation for hiding, we define
$hide: graphs \times channels \to \{\bot\} \cup graphs$.


DEFINITION  3 

Let G1 and G2 be any two priority graphs that are consistent (i.e., precedence) and
cset be some non-empty finite set of channels. Then, we define,

$join_{cset}(G_1, G_2) = hide(join(G_1, G_2), cset)$.

DEFINITION  4 

Let G be some priority consistent graph and cset be some finite non-empty set of
channels. Then,

(1) $hide(\bot, cset) = \bot$.

(2) $hide(G, \phi) = G$.

(3) $\forall D \in cset: hide(G, cset) = hide(G', cset - \{D\})$ where

$G' = \{(a, b) \in G \mid D \notin \{a, b\}\} \cup \{(a, b) \mid (a, D) \in G \wedge (D, b) \in G\}$.
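
Definition 4 can be read as repeatedly removing a hidden channel from the graph while bridging its predecessors to its successors, so that the precedences between the remaining channels are preserved. A short Python sketch, with `None` as an assumed encoding of the inconsistent graph and an iterative loop in place of the recursion:

```python
BOTTOM = None  # assumed encoding of the inconsistent priority graph

def hide(G, cset):
    """Remove the channels in cset from priority graph G, keeping induced precedences."""
    if G is BOTTOM:
        return BOTTOM
    G = set(G)
    for D in sorted(cset):
        kept = {(a, b) for (a, b) in G if D not in (a, b)}
        # Bridge every predecessor of D to every successor of D.
        bridged = {(a, b) for (a, x) in G for (y, b) in G if x == D and y == D}
        G = kept | bridged
    return G
```

For example, hiding D in the chain a above D above b leaves the single precedence a above b.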

Hiding. Let cset be the set of internal channels of S. Then,

$$M[[S]]\sigma = PFC(\{\langle\sigma', \langle A, G\rangle\rangle \mid \exists\langle A', G'\rangle: \langle\sigma', \langle A', G'\rangle\rangle \in M[S]\sigma \wedge A = A' \setminus cset \wedge G = hide(G', cset)\})$$

where $A' \setminus cset$ is the history obtained after removing all the communications and
readies on channels cset from A'. Note that the empty set is represented by □ and
hence the time points are preserved.

In  the  following,  we  sketch  proofs  of  theorems  for  establishing  the  soundness 
(priority  consistency)  and  the  compositionality  of  the  semantics.  The  following 
theorem  establishes  the  priority  consistency. 

Theorem 1. Consider an alternative command $[\,\Box_{i=1}^{n}\, g_i \rightarrow S_i\,]$. Let the process be enabled
on channels $d_1, d_2, \ldots, d_m$ $(m \le n)$ with increasing order of priority at time $t_1$. Then,
communication over $d_j$ started at $t_2$ $(t_2 \ge t_1)$ implies that communication over
$d_{j+1}, \ldots, d_m$ was impossible during $t_1$ to $t_2$ and communication over $d_j$ was impossible
during $t_1$ to $t_2 - 1$.

Proof. That the process was enabled on channels $d_1, d_2, \ldots, d_m$ $(m \le n)$ at time $t_1$ implies
that the traces have the structure $\alpha\, R(d_1, \ldots, d_m)^{t_2 - t_1}\, \beta$, where $\alpha$ and $\beta$ are some traces up to
$t_1$ and after $t_2 - 1$ respectively. It must be noted that $\beta$ would have at least one
non-wait action in its first component. Thus, by the predicates NW and Pri it follows
that this was the earliest possible action. Note that the condition also holds when
we consider hiding. Hence the theorem.
Proof. Associativity of the parallel composition, ignoring the priority information,
follows along the lines of the proof given for CSP-R (Koymans et al 1988). What we need
to show is that the composition of the priority information also satisfies the property
of associativity. In other words, we have to prove that $join(join(G_1, G_2), G_3) =
join(G_1, join(G_2, G_3))$. This follows from the fact that the union² of relations is
associative.

Theorem  3.  The  semantics  is  priority  consistent  and  compositional. 

Proof.  Proof  follows  from  theorem  1  and  theorem  2. 

Illustrative example. In the following, we consider a simple example that illustrates
the possibility of a deadlock in a network consisting of n processes, $P_1, P_2, \ldots, P_n$,
for some given priority. For the sake of brevity, we ignore the values sent and received
in the example. In the example, we consider only two processes $P_1$ and $P_2$, which
have $\{\beta, \gamma\}$ as the common set of channels. The interesting feature of the example is
that it depicts how the two processes get into deadlock for the given assignment of
priorities irrespective of the behaviour of the other processes.

Example  program 

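$P_1 ::= [\,\alpha?\ \text{by}\ 1 \rightarrow\ \ \Box\ \beta?\ \text{by}\ 2 \rightarrow\ \ \Box\ \gamma!\ \text{by}\ 3 \rightarrow\,]$

$P_2 ::= [\,\omega!\ \text{by}\ 1 \rightarrow\ \ \Box\ \gamma?\ \text{by}\ 2 \rightarrow\ \ \Box\ \beta!\ \text{by}\ 3 \rightarrow\,]$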

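$M[P_1]\sigma = PFC(\{\langle\sigma', H_1\rangle\})$ where $H_1$ consists of

$\{\,\{R(\gamma,\beta,\alpha)\}^{*} \circ \langle\{R(\gamma,\beta), \langle\alpha,?\rangle\}, \langle\{\gamma,\beta\}, \alpha, \emptyset\rangle\rangle,$

$\{R(\gamma,\beta,\alpha)\}^{*} \circ \langle\{R(\gamma), \langle\beta,?\rangle\}, G_3\rangle,$

$\{R(\gamma,\beta,\alpha)\}^{*} \circ \langle\langle\gamma,?\rangle, G_4\rangle\,\}$,

where $G_3 = \langle\{\gamma\}, \beta, \{\alpha\}\rangle$ and $G_4 = \langle\emptyset, \gamma, \{\beta,\alpha\}\rangle$ and "?" denotes that the value part
is ignored.

$M[P_2]\sigma = PFC(\{\langle\sigma', H_2\rangle\})$ where $H_2$ consists of,

$\{\,\{R(\beta,\gamma,\omega)\}^{*} \circ \langle\{R(\beta,\gamma), \langle\omega,?\rangle\}, \langle\{\beta,\gamma\}, \omega, \emptyset\rangle\rangle,$

$\{R(\beta,\gamma,\omega)\}^{*} \circ \langle\{R(\beta), \langle\gamma,?\rangle\}, G_7\rangle,$

$\{R(\beta,\gamma,\omega)\}^{*} \circ \langle\langle\beta,?\rangle, G_8\rangle\,\}$,

where $G_7 = \langle\{\beta\}, \gamma, \{\omega\}\rangle$ and $G_8 = \langle\emptyset, \beta, \{\gamma,\omega\}\rangle$.

The parallel composition is given by

$M[P_1 \| P_2]\sigma = \{\langle\bot, \Lambda\rangle\}$.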

Explanation. For the sake of brevity, we consider only those histories that become
inconsistent due to inconsistency of priority. The common set of channels of $P_1$ and
$P_2$ is given by $cset = \{\beta, \gamma\}$. Most of the histories become inconsistent due to predicates
Comm and NW. The following pairs of histories get eliminated due to inconsistent
priorities, that is, both $join(G_3, G_8)$ and $join(G_4, G_7)$ are $\bot$. Let us take a closer look
at how these two pairs of histories become inconsistent.

2 Note that the operation join is associative.

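$h_3 = \langle\{R(\gamma), \langle\beta,?\rangle\}, \langle\{\gamma\}, \beta, \{\alpha\}\rangle\rangle$; $h_8 = \langle\langle\beta,?\rangle, \langle\emptyset, \beta, \{\gamma,\omega\}\rangle\rangle$.

Obviously, in $G_3$, $\gamma$ has priority over $\beta$ while in $G_8$, $\beta$ has priority over $\gamma$. Thus,
there is an inconsistent partial ordering. Now, consider the pair of histories,

$h_4 = \langle\langle\gamma,?\rangle, \langle\emptyset, \gamma, \{\beta,\alpha\}\rangle\rangle$; $h_7 = \langle\{R(\beta), \langle\gamma,?\rangle\}, \langle\{\beta\}, \gamma, \{\omega\}\rangle\rangle$.

Again here, $\gamma$ has priority over $\beta$ (in $G_4$), while in $G_7$, $\beta$ has priority over $\gamma$; that
is, the ordering is inconsistent.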
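The two inconsistent pairs can be checked mechanically. The sketch below is a simplification and not the paper's definition of join, which is given outside this passage: it assumes that join is the union of the two orderings, that the result is bottom (modelled as `None`) whenever the union contains a cycle, and that "equal or lower" may be treated as "lower". The helper names `edges` and `join` are illustrative.

```python
def edges(triple):
    """Turn a triple (HI, D, EL) into 'x has priority over y' pairs."""
    hi, d, el = triple
    return {(h, d) for h in hi} | {(d, low) for low in el}


def join(g1, g2):
    """Union of two priority orderings; None (bottom) if they conflict."""
    rel = edges(g1) | edges(g2)
    closure = set(rel)
    while True:  # naive transitive closure
        new = {(a, c) for (a, b) in closure for (b2, c) in closure if b == b2}
        if new <= closure:
            break
        closure |= new
    # a channel with priority over itself signals an inconsistent ordering
    if any(a == b for (a, b) in closure):
        return None
    return rel


G3 = ({"gamma"}, "beta", {"alpha"})
G4 = (set(), "gamma", {"beta", "alpha"})
G7 = ({"beta"}, "gamma", {"omega"})
G8 = (set(), "beta", {"gamma", "omega"})
```

Under these assumptions both `join(G3, G8)` and `join(G4, G7)` evaluate to `None`, whereas `join(G3, G4)`, which combines two orderings of the same process, is consistent.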

In  other  words,  the  two  processes  get  deadlocked  in  the  beginning  itself  despite 
the  behaviour  of  other  processes  in  the  network.  Thus,  the  two  processes  do  not  do 
anything. 

Real-time  execution  models 

In  the  previous  sections,  we  have  discussed  semantic  specification  of  priority  using 
the  maximal  parallelism  model.  In  this  section,  we  briefly  discuss  the  effect  of  various 
parameters  including  those  that  affect  maximal  parallelism. 

•  It  easily  follows  that  ignoring  priority  leads  essentially  to  the  same  semantics  as 
in  Koymans  et  al  (1988). 

•  In  Koymans  et  al  (1988)  a  spectrum  of  models  ranging  from  interleaving  to  maximal 
parallelism  has  been  given  accounting  for  the  communication  media  and  a  range 
in  timings  for  actions.  The  semantics  described  here  can  also  be  augmented  on  the 
same  lines  to  account  for  the  various  parameters.  However,  it  may  be  noted  that 
in the case of the interleaving model the semantics no longer ensures Lamport’s
requirement (cf. Lamport 1985) that there should not be any unnecessary waiting
in realistic priority specification; this is in conformity with Lamport’s conjecture.

•  Though  the  principle  of  one  processor  to  one  logical  process  is  quite  feasible,  there 
are  many  situations  which  force  resource  restrictions  either  due  to  logical  design 
(for  example,  recursive  processes)  or  due  to  physical  constraints  of  space.  The  need 
of  resource  restrictions  leads  to  scheduling  requirements.  Thus,  the  semantics  should 
be able to handle interrupts of statements with higher priority. One possible solution
is  to  combine  the  approaches  of  this  paper  with  that  of  Hooman  (1989). 

•  Another  aspect  that  one  comes  across  in  realistic  situations  is  that  of  assumptions 
about  bus-arbitration  or  in  general  fairness  issues.  Perhaps  one  can  handle  some 
of  these  issues  in  a  limited  way  similar  to  that  of  scheduling;  a  thorough  investigation 
is  needed  to  tackle  the  issue  of  fairness  in  the  context  of  compositional  semantics. 


6.  Discussion 

In  this  paper,  we  have  developed  a  compositional  denotational  semantics  for 
prioritized  real-time  distributed  programming  languages.  One  of  the  interesting 
features  is  that  it  extends  the  compositional  theory  proposed  in  Koymans  et  al  (1988) 
for  prioritized  real-time  languages  preserving  the  compositionality  of  the  semantics. 
As mentioned already, the language permits users to define situations in which an
action has priority over another action without the requirement of preassigning
priorities to actions for partially ordering the alphabet of actions. These features are
part of languages such as Ada that were designed specifically with the needs of real-time
embedded systems in mind.

Our  approach  does  not  have  the  restriction  (as  in  Cleaveland  &  Hennessy  1990) 
that  only  prioritized  internal  moves  can  pre-empt  unprioritized  actions.  Our  notion 
of  priority  in  the  environment  is  based  on  the  intuition  that  a  low  priority  action  can 
proceed  only  if  the  high  priority  action  cannot  proceed  due  to  lack  of  the  handshaking 
partner  at  that  point  of  execution.  In  other  words,  if  some  action  is  possible 
corresponding  to  that  environment  at  some  point  of  execution  then  the  action  takes 
place  without  unnecessary  waiting.  It  must  be  clear  that  such  an  approach  clearly 
satisfies  Lamport’s  requirement  (Lamport  1985)  that  realistic  priority  specification 
should  not  involve  unnecessary  waiting.  The  condition  of  “no  unnecessary  waiting” 
itself provides a sort of priority for unprioritized actions³ and thus provides a natural
model  for  the  tasking  features  of  Ada.  In  the  semantic  theory  we  have  proposed  there 
is  a  clear  distinction  between  the  semantic  model  and  the  execution  model  -  this  has 
enabled  us  to  fully  ensure  that  there  is  no  unnecessary  waiting.  It  is  of  interest  to 
note  that  the  real-time  semantics  proposed  satisfies  Lamport’s  conditions  in  a  natural 
way.  Also,  our  work  is  the  first  formal  semantics  for  treating  priority  that  is 
state-based  -  thus,  having  advantages  over  algebraic  approaches  for  reasoning  about 
reactive  systems. 

We  believe  that  the  proposed  compositional  theory  provides  a  sound  basis  for  the 
languages  for  programming  reactive  systems  (see  Liu  &  Shyamasundar  1989).  It  may 
be  observed  that  we  have  developed  the  semantics  by  considering  the  priorities  of 
events  in  each  process  in  an  isolated  manner.  Thus,  it  is  only  the  partial  ordering  of 
the  events  that  is  important  rather  than  the  priority  number  associated  with  the 
events.  We  have  used  prefix-closed  sets  as  our  semantic  domain  in  order  to  treat  any 
general  reactive  system. 

In  our  semantics,  we  have  given  priority  for  pure  boolean  guards  so  as  to  be 
consistent  with  the  semantics  defined  in  Koymans  et  al  (1988).  This  restriction  can 
be  removed  very  easily  by  appropriately  changing  the  a  priori  semantics  of  the 
alternative  command.  The  theory  can  also  be  extended  to  include  priorities  of  types 
(1)  and  (2)  discussed  earlier.  Furthermore,  the  semantics  can  be  tailored  to  terminating 
systems  by  considering  complete  traces  instead  of  prefix-closed  sets.  The  semantic 
equations  can  be  easily  extended  for  this  case  (the  main  difference  will  be  that 
we  would  be  using  greatest  fix  point  rather  than  the  least  fix  point  for  iteration).  It 
may  be  noted  that  from  the  real-time  semantics,  one  can  obtain  a  temporal  logic 
proof  system  on  the  lines  of  Hooman  &  Widom  (1988).  A  static  analysis  of  CSP-Rp 
programs on the lines of the characterization in Liu & Shyamasundar (1988,
pp. 134-138, 1990) can be used for deriving tools for the specification and verification
of prioritized finite state systems. Currently, we are investigating the applicability of
the theory to the verification of communication protocols including complex protocols
such as carrier sense protocols discussed in Pnueli & Harel (1988, pp. 84-98).


3 This  should  not  be  misunderstood  as  the  priority  model  proposed;  this  only  shows  that 
the  maxpar  execution  model  has  some  additional  advantages  in  the  context  of  priority. 
Appendix A

Consider the problem of minimizing the head movement in disc access, which requires
a scheduling discipline other than the usual FIFO discipline of Ada. We briefly discuss
the solution described in Gehani (1983) using the strategy of families of entries. Let
us assume that the requests for service are classified into three categories declared as

type REQUEST_LEVEL is (URGENT, NORMAL, LOW);

Urgent  requests  are  accepted  before  any  other  kind  of  requests.  Normal  requests  are 
accepted  only  if  there  are  no  urgent  requests  pending.  Finally  requests  in  the  low 
category  are  accepted  only  if  there  are  no  urgent  or  normal  priority  requests  pending. 
Within  each  category  requests  are  accepted  in  FIFO  order. 

This  scheme  is  implemented  by  a  task  SERVICE  that  contains  the  declarations  of 
an  entry  family  REQUEST: 

task SERVICE is
   entry REQUEST (REQUEST_LEVEL) (D: in out DATA);
end SERVICE;

Each  member  of  REQUEST  handles  one  request  category.  The  body  of  task  SERVICE 
is  given  below: 

task body SERVICE is
begin
   loop
      select
         accept REQUEST (URGENT) (D: in out DATA) do
            -- ... process the request
         end REQUEST;
      or
         when REQUEST (URGENT)'COUNT = 0 =>
            -- the number of tasks waiting at an entry is
            -- given by the COUNT attribute
            accept REQUEST (NORMAL) (D: in out DATA) do
               -- ... process the request
            end REQUEST;
      or
         when REQUEST (URGENT)'COUNT = 0 and
              REQUEST (NORMAL)'COUNT = 0 =>
            -- the number of tasks waiting at an entry is
            -- given by the COUNT attribute
            accept REQUEST (LOW) (D: in out DATA) do
               -- ... process the request
            end REQUEST;
      end select;
   end loop;
end SERVICE;
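The acceptance policy of the SERVICE task (urgent first, then normal, then low, FIFO within each category) can be mimicked sequentially. The Python sketch below is only a model of the selection rule, not of Ada tasking or rendezvous; the class `Service` and its methods are hypothetical names introduced for illustration, and the length of each queue plays the role of the COUNT attribute.

```python
from collections import deque

LEVELS = ("URGENT", "NORMAL", "LOW")  # highest category first


class Service:
    """Sequential model of the guarded select of task SERVICE."""

    def __init__(self):
        # one FIFO queue per member of the entry family
        self.queues = {level: deque() for level in LEVELS}

    def request(self, level, data):
        """A caller queues up at entry REQUEST(level)."""
        self.queues[level].append(data)

    def accept(self):
        """Accept the oldest request of the highest non-empty category."""
        for level in LEVELS:
            # a lower category is reached only if all higher queues are empty,
            # which corresponds to the guards on REQUEST(...)'COUNT = 0
            if self.queues[level]:
                return level, self.queues[level].popleft()
        return None  # nothing pending


s = Service()
s.request("LOW", "a")
s.request("URGENT", "b")
s.request("URGENT", "c")
s.request("NORMAL", "d")
order = [s.accept() for _ in range(5)]
```

In this run the requests are served in the order b, c, d, a, and the fifth call finds nothing pending.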


Appendix  B 

Before  describing  the  semantic  equations,  we  will  define  the  auxiliary  functions 
required. 
Let $\phi$ be a function from S to Ddom. Then $\phi'$ is the function from S to Ddom defined
by $\phi'(\sigma) = $ if $\sigma \in S$ then $\phi(\sigma)$ else $PFC(\{\langle\sigma, \Lambda\rangle\})$.

Further, $\phi^{*}$ is the function from Ddom to Ddom defined by,

$\phi^{*}(X) = \{\langle\sigma', h \circ h'\rangle \mid \langle\sigma, h\rangle \in X \text{ and } \langle\sigma', h'\rangle \in \phi'(\sigma)\}$.

The  function  G  is  defined  using  the  following  two  auxiliary  functions. 

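$\Gamma(\{b_1;g_1, \ldots, b_n;g_n\}, \sigma) = \{R(D) \mid \exists i: W[b_i]\sigma = tt \wedge (g_i = D!e \vee g_i = D?x)\}$,

$waitvalue(b;g, \sigma) = \begin{cases} 0, & \text{if } g = \Lambda \wedge W[b]\sigma = tt, \\ \max(n, 1), & \text{if } g = \text{wait } n \wedge W[b]\sigma = tt, \\ \infty, & \text{otherwise.} \end{cases}$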

A  priori  semantics 

$M[S]\bot = \{\langle\bot, \Lambda\rangle\}$ for any S; $\Lambda$ is the empty sequence.

$M[\text{skip}]\sigma = PFC(\{\langle\sigma, \{□\}\rangle\})$.

$M[x := e]\sigma = PFC(\{\langle\sigma[V[e]\sigma / x], \{□\}\rangle\})$

• where V is the semantic function (assumed to be given) for evaluating the arithmetic
expressions.

I/O  statements 

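$M[D?x]\sigma = PFC(\{\langle\sigma[v/x], R(D)^{t} \circ \langle D, v\rangle\rangle \mid v \in Val, t \ge 0\})$.

The semantics corresponds to indefinite waiting till the communication succeeds.

$M[D!e]\sigma = PFC(\{\langle\sigma, R(D)^{t} \circ \langle D, V[e]\sigma\rangle\rangle \mid t \ge 0\})$.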

The  semantics  of  guards  is  defined  in  terms  of  an  environment  of  boolean,  I/O  and 
wait  guards.  We  do  not  give  the  semantics  of  the  wait  statement  as  it  follows  from  the 
semantics  of  the  wait  guard  (assuming  empty  environment). 

$M[g]\sigma = G[g, \emptyset]\sigma$, where g is an I/O command,

$G[b, A]\sigma = $ if $W[b]\sigma$ then $PFC(\{\langle\sigma, \Lambda\rangle\})$ else $\{\langle\bot, \Lambda\rangle\}$ fi,

• where W is the semantic function (assumed to be given) for evaluating the boolean
expressions.

Gfwait  d,Aja  = 

PFC({<er,  T(A,  of  }\max  {i^  Ifija,  1}  =  minwait(Avj  {waitd},o)A  t}), 

•  where  minwait  (A,  a)  gives  the  minimum  of  the  waiting  periods  corresponding  to 
elements  of  A. 

$G[D?x \text{ by } p, A]\sigma =$

$PFC(\{\langle\sigma[v/x], \Gamma(GRDS, \sigma)^{t} \circ \langle\{R(HI(D, A)), \langle D, v\rangle\}, BPG(D, GRDS)\rangle\rangle \mid v \in Val, 0 \le t < minwait(A, \sigma)\})$,

• where $GRDS = A \cup \{D?x \text{ by } p\}$, $HI(D, A)$ returns the set of channels in A which
have higher priority than channel D, $EL(D, GRDS)$ returns the set of channels in
GRDS which have equal or lower priority than channel D, and

$BPG(D, GRDS) = \langle HI(D, GRDS), D, EL(D, GRDS)\rangle$.
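The functions HI, EL and BPG are simple set comprehensions over the guards of an alternative command. The Python sketch below assumes, as an illustration only, that the guard set is given as a dictionary from channel name to its "by" number, that a larger number means a higher priority (this is inferred from the illustrative example, where $G_3 = \langle\{\gamma\}, \beta, \{\alpha\}\rangle$), and that EL does not contain D itself.

```python
def hi(d, grds):
    """Channels of grds with strictly higher priority than channel d."""
    return {c for c, p in grds.items() if p > grds[d]}


def el(d, grds):
    """Channels of grds, other than d, with equal or lower priority than d."""
    return {c for c, p in grds.items() if c != d and p <= grds[d]}


def bpg(d, grds):
    """Priority triple (HI, D, EL) recorded with a communication on d."""
    return (hi(d, grds), d, el(d, grds))


# guards of P1 from the illustrative example: alpha by 1, beta by 2, gamma by 3
P1 = {"alpha": 1, "beta": 2, "gamma": 3}
g3 = bpg("beta", P1)
g4 = bpg("gamma", P1)
```

With these assumptions `g3` and `g4` coincide with the triples $G_3$ and $G_4$ of the illustrative example.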

$G[D!e \text{ by } p, A]\sigma =$

$PFC(\{\langle\sigma, \Gamma(GRDS, \sigma)^{t} \circ \langle\{R(HI(D, A)), \langle D, V[e]\sigma\rangle\}, BPG(D, GRDS)\rangle\rangle \mid 0 \le t < minwait(A, \sigma)\})$,

• where $GRDS = A \cup \{D!e \text{ by } p\}$. The other interpretations remain the same as in
the case of input guard.

$G[b;g, A]\sigma = G[g, A]^{*}(G[b, A]\sigma)$, where g = D?x by p or D!e by p or wait d.

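$M[[\,\Box_{i=1}^{n}\, g_i \rightarrow S_i\,]]\sigma =$
if $\bigvee_{i=1}^{n} W[b_i]\sigma$ (where $b_i$ is the boolean part of $g_i$) then
$\bigcup_{j=1}^{n} M[S_j]^{*}(G[g_j, \{g_k \mid 1 \le k \le n, k \ne j\}]\sigma)$
else $PFC(\{\langle\bot, \Lambda\rangle\})$ fi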

M  [*  i4|  a  =  g(j).Xa. if  3 i\  =  then  else  PFC({<a,2)})fi. 
Ecole des Mines, Sophia-Antipolis, 06565 Valbonne, France, and
Digital  Equipment,  Paris  Research  Laboratory,  85,  Av.  Victor  Hugo,  92 
Rueil  Malmaison,  France 

Abstract. ESTEREL is a synchronous concurrent programming language
dedicated to reactive systems (controllers, protocols, man-machine
interfaces etc.). ESTEREL has an efficient standard software implementation
based on well-defined mathematical semantics. We present a new hardware
implementation of the pure synchronization subset of the language. Each
program generates a specific circuit that responds to any input in one
clock cycle. When the source program satisfies some statically checkable
dynamic properties, the circuit is shown to be semantically equivalent to
the source program. The hardware translation has been effectively
implemented on the programmable active memory PERLE0 developed by
J Vuillemin and his group at Digital Equipment.

Keywords.  Pure  ESTEREL;  synchronous  programming  language;  reactive 
systems;  hardware  implementation;  mathematical  semantics. 


1.  Introduction 

ESTEREL  (Berry  &  Cosserat  1984;  Berry  et  al  1988;  Berry  &  Gonthier  1988; 
Boussinot  &  de  Simone  1991)  is  a  synchronous  programming  language  devoted  to 
reactive  systems,  that  is,  to  systems  that  maintain  a  continuous  interaction  with  their 
environment  by  handling  hardware  or  software  events.  Its  software  implementation 
is  currently  used  in  industry  and  education  to  program  software  objects  such  as 
real-time  controllers,  communication  protocols  (Berry  &  Gonthier  1989;  Murakami  & 
Sethi  1990),  man-machine  interfaces  (Clement  &  Incerpi  1989),  systems  drivers  etc. 
In  this  paper,  we  present  a  hardware  implementation  of  the  pure  synchronization 
subset  of  the  language  that  builds  a  specific  circuit  for  each  program.  We  prove  the 
correctness  of  this  implementation  w.r.t.  the  mathematical  semantics  of  the  language 
under  some  conditions  to  be  satisfied  by  the  source  program.  We  describe  the 
experiments  made  so  far  and  the  possible  uses  of  the  hardware  implementation. 

1.1  The  perfect  synchrony  hypothesis 

ESTEREL  is  an  imperative  concurrent  language  with  very  high-level  control  and  event 
manipulation  constructs.  It  is  based  on  a  perfect  synchrony  hypothesis  (Berry  & 
Benveniste  1991),  which  states  that  control  transmission,  communication,  and 
elementary computation actions take no time, or, in other words, that the program
is  conceptually  executed  on  an  infinitely  fast  machine.  The  control  structures  include 
sequencing,  testing,  looping,  concurrency,  and  a  powerful  exception  mechanism  which 
is  fully  compatible  with  concurrency,  unlike  in  asynchronous  concurrent  programming 
languages  (Berry  1989).  The  primitive  communication  device  between  concurrent 
statements  is  instantaneous  broadcasting  of  signals. 

The  perfect  synchrony  hypothesis  is  shared  by  the  synchronous  data-flow  languages 
LUSTRE  (Caspi  et  al  1987;  Halbwachs  1991)  and  SIGNAL  (Gauthier  et  al  1987; 
Le  Guernic  et  al  1991).  It  makes  programming  very  modular  and  flexible,  and  it  makes 
it  possible  to  reconcile  input-output  determinism  and  concurrency.  This  is  a  great 
benefit  over  classical  asynchronous  languages  such  as  OCCAM  or  ADA  that  are 
inherently  non-deterministic,  a  characteristic  that  makes  reactive  programming  and 
debugging  needlessly  difficult,  see  Berry  (1989). 

ESTEREL  is  rigorously  defined  by  well-analysed  mathematical  semantics,  given  in 
both  denotational  and  operational  styles  (Berry  &  Gonthier  1988;  Gonthier  1988). 

1.2  ESTEREL  in  software 

The  standard  ESTEREL  compiler  is  directly  based  on  one  of  the  mathematical 
semantics.  It  uses  sophisticated  algorithms  to  translate  a  concurrent  reactive  program 
into  an  equivalent  efficient  sequential  automaton  that  can  be  implemented  in  any 
conventional  language.  Concurrency  is  compiled  away  during  this  process.  The 
resulting  automaton  can  be  directly  run  by  actual  applications.  In  addition  to  the 
compiler,  the  ESTEREL  environment  includes  sophisticated  tools  such  as  symbolic  or 
graphical  simulators  and  interfaces  to  automata-based  program  verification  systems 
such  as  AUTO  (Boudol  et  al  1990). 

1.3  ESTEREL  in  hardware 

Since  many  CAD  systems  directly  support  automata-like  specifications,  ESTEREL 
programs  can  be  implemented  in  hardware  by  first  translating  them  into  automata 
using  the  standard  compiler.  However,  this  indirect  translation  loses  most  of  the 
source  concurrent  structure.  This  is  usually  a  good  idea  in  software,  where  run-time 
concurrency  is  in  fact  expensive,  but  not  in  hardware,  where  concurrency  is  free  and 
should  be  used  as  much  as  possible.  Furthermore,  there  is  no  simple  relation  between 
the  source  program  size  and  the  size  of  the  generated  automaton.  In  the  worst  case, 
the  automaton  size  can  be  exponential  in  the  source  program  size,  and  square  factors 
are  not  rare.  Again,  this  is  much  more  acceptable  in  software  than  in  hardware. 

The  direct  hardware  implementation  we  present  here  is  conceptually  much  better; 
it  is  based  on  Gonthier’s  (1988)  semantic  analysis  of  ESTEREL.  It  transforms  each 
program  into  a  digital  circuit  that  exactly  reflects  the  source  concurrency  and 
communication  structure.  The  circuit  computes  the  response  to  any  input  within 
exactly  one  clock  cycle,  however  complex  the  program  is.  The  translation  is  purely 
structural  (compositional)  and  linear  in  size.  However,  it  is  at  present  limited  to  the 
pure  synchronization  subset  of  the  language,  which  we  call  PURE  ESTEREL,  and  it 
works  only  under  some  restrictive  conditions  to  be  satisfied  by  the  source  program. 

The  translation  is  completely  formalized  and  proved  correct  w.r.t.  the  mathematical 
semantics  under  the  above  restrictive  conditions.  Correctness  relies  on  the  fact  that 
perfect  synchrony  does  not  depart  very  much  from  digital  circuit  synchrony:  zero-time 
is  simply  replaced  by  one  cycle. 
1.4 Actual implementation and applications

The translation from programs to circuits has been implemented within the existing
ESTEREL compiler. We have run very successful experiments using the XILINX™-based
PERLE0 programmable coprocessor developed at DEC Paris Research Laboratory by
Vuillemin and co-workers (Bertin et al 1989; Shand et al 1990).

We  are  currently  investigating  two  kinds  of  applications: 

•  Implementing  existing  ESTEREL  programs  in  hardware  to  match  high  performance 
constraints.  For  example,  we  have  directly  implemented  the  kernel  of  a  fast  local 
area  network  protocol  that  was  developed  in  ESTEREL  at  INRIA  (Mejia  Olvera 
1989). 

• Programming hardware controllers in ESTEREL. The language turns out to be
well-adapted  to  programming  the  control  part  of  a  circuit,  which  is  known  to  be 
difficult  and  error-prone  with  usual  techniques.  We  show  a  toy  example  in 
appendix  A. 

The fact that the language can be implemented either in software or in hardware
is useful in two respects: one can use the software programming environment to
develop, debug, and verify the programs; and one can experiment with various trade-offs
between hardware and software without changing the source code.

1.5  ESTEREL  and  LUSTRE 

The LUSTRE synchronous language has also been implemented on hardware at DEC
PRL, and the implementations of ESTEREL and LUSTRE are fully compatible¹. It
should be noted that both languages differ from most existing hardware description
languages in that they deal only with behaviors and not with hardware
objects, and also in the care with which they were mathematically defined and studied.
To describe circuits, LUSTRE and ESTEREL are complementary: LUSTRE is well-adapted
to data path description, ESTEREL is well-adapted to control automata.

1.6  Structure  of  the  paper 

Section  2  presents  the  PURE  ESTEREL  language  and  its  intuitive  semantics.  We  give 
enough  material  for  the  paper  to  be  self-contained,  but  not  to  fully  understand  the 
ESTEREL  programming  style,  referring  to  Berry  et  al  (1988),  Berry  &  Gonthier  (1989) 
and  to  the  ESTEREL  documentation  for  these  aspects.  The  mathematical  semantics  of 
PURE  ESTEREL  is  given  in  §  3.  Section  4  presents  an  essential  part  of  the  theory  of 
ESTEREL,  the  coding  of  states  by  haltsets.  This  coding  is  the  root  of  the  hardware 
translation,  whose  principle  is  presented  by  examples  in  §  5.  The  translation  is  then 
formalized  in  §6  and  proved  correct  in  §7.  We  discuss  the  actual  implementation  on 
PERLEO  in  §  8  and  conclude.  An  appendix  gives  the  example  of  a  simple  bus  interface 
and  briefly  analyzes  the  adequacy  of  ESTEREL  to  program  hardware  controllers. 


1 A by-product of our work is a translator from PURE ESTEREL into LUSTRE.



G  Berry 


2.  PURE  ESTEREL 

We first present signals and events, which are the basic objects manipulated by PURE
ESTEREL programs. We then present the kernel language on which the semantics is
defined and the full language that includes kernel-definable user-friendly statements.

2.1  Signals  and  events 

PURE ESTEREL deals with signals $S, S_1, \ldots$ and with events E, which are sets of
simultaneous signals. A signal that belongs to an event is said to be present in that
event, otherwise it is said to be absent.

The execution of a program associates a sequence of output events with any sequence
of input events. The program repeatedly receives an input event $E_i$ from its environment
and reacts by building an output event $E'_i$. That $E_i$ and $E'_i$ are synchronous is expressed
by the fact that any external observer observes a single event $E_i \cup E'_i$. This is in
particular true of any other program placed in parallel.

The  production  of  an  output  event  from  an  input  event  is  called  a  reaction.  The 
flow  of  time  being  entirely  defined  by  the  sequence  of  reactions,  we  also  call  a  reaction 
an  instant.  This  gives  sense  to  temporal  expressions  such  as  “instantaneously”  or 
“immediately”,  which  mean  “at  the  same  instant”,  or  “from  then  on”,  which  means 
“after  the  current  instant  included”,  or  “in  the  strict  future”,  which  means  “after  the 
current  instant  excluded”. 

We  assume  that  each  input  event  contains  a  special  signal  tick,  which  is  therefore 
present  at  all  instants.  This  addition  to  the  original  language  of  Berry  &  Gonthier 
(1988)  is  now  supported  by  the  ESTEREL  implementation.  The  tick  signal  is  analogous 
to  the  constant  1  in  circuits  or  the  constant  true  in  LUSTRE.  When  programming  digital 
circuits,  it  will  naturally  denote  clock  ticks. 

2.2  Modules 

The basic PURE ESTEREL programming unit is the module. A module has an interface,
which specifies its input signals I, I1, ... and its output signals O, O1, ..., and a body,
which is a statement that specifies its behaviour². The body can use any number of local
signals for internal broadcast communication. To achieve modular programming, a
module can instantiate other modules as described later on. Here is a sample module
definition:

module M:
input I1, I2;
output O1;
statement.

2.3  Kernel  statements 

The  primitive  or  kernel  PURE  ESTEREL  statements  are: 

nothing
halt
emit S
stat1 ; stat2
loop stat end
present S then stat1 else stat2 end
do stat watching S
stat1 || stat2
trap T in stat end
exit T
signal S in stat end

2 There are also input/output signals, ignored here for simplicity.

One can use brackets ‘[’ and ‘]’ to group statements; by default, ‘;’ binds tighter than ‘||’.
Both then and else parts are optional in a present statement. If omitted, they are
taken to be nothing.

The  statements  are  imperative  and  manipulate  control  and  signals.  Most  of  them  are 
classical  in  appearance.  The  trap-exit  mechanism  is  an  exception  mechanism  fully 
compatible  with  parallelism.  Traps  are  lexically  scoped. 

The  local  signal  declaration  “signal  S  in  stat  end”  declares  a  lexically  scoped  signal  S 
that  can  be  used  for  internal  broadcast  communication  within  stat. 

2.4  The  intuitive  semantics 

The  intuitive  semantics  deals  with  control  transmission  between  statements  and  with 
signal  broadcasting.  A  statement  can  start  at  some  instant  and  remain  active  until  it 
releases  the  control  at  some  further  instant,  either  by  terminating  or  by  exiting  a  trap. 
After  termination  or  exit,  a  statement  becomes  inactive.  A  statement  that  terminates 
or  exits  at  the  same  instant  it  starts  is  said  to  be  instantaneous.  When  an  active 
statement  does  not  terminate  and  exits  no  trap  at  an  instant,  it  is  said  to  halt  at 
that  instant. 

The  intuitive  semantics  is  defined  by  structural  induction  on  statements: 

•  nothing  terminates  instantaneously. 

•  halt  never  terminates  nor  exits.  It  always  halts. 

•  An  “emit  S”  statement  broadcasts  the  signal  S  and  terminates  instantaneously. 

• When started, a sequence “stat1 ; stat2” immediately starts stat1 and behaves like it. If
and when stat1 terminates, stat2 starts immediately and determines the behaviour of
the sequence from then on. If and when stat1 exits a trap T, so does the whole
sequence, stat2 being never started in this case. Notice that stat2 is also never started
if stat1 always halts. Notice also that “emit S1 ; emit S2” emits S1 and S2
simultaneously and terminates instantly.

•  A  loop  acts  as  an  infinite  sequence.  When  started,  “loop  stat  end”  immediately  starts 
its  body  stat.  When  the  body  terminates,  it  is  immediately  restarted.  If  the  body  exits 
a  trap,  so  does  the  whole  loop.  The  body  of  a  loop  is  not  allowed  to  terminate 
instantaneously  when  started. 

• When a “present S then stat1 else stat2 end” statement starts, it immediately starts
stat1 if S is present in the current instant and stat2 if S is absent. The present
statement then behaves as the corresponding branch.

•  The  “do  stat  watching  S”  watchdog  statement  immediately  starts  its  body  and 
behaves  like  it  until  the  time  guard  S  occurs. 

- If stat terminates or exits a trap strictly before S occurs, then the watching
statement instantaneously terminates or exits the same trap.
- If, in the strict future of the starting instant, S occurs while stat is still active, then
the watching statement terminates instantaneously and kills stat, which is not
activated in the corresponding instant.

Notice  two  boundary  problems:  the  guard  becomes  active  only  at  the  next  instant 
following  the  starting  instant;  the  body  is  not  activated  when  the  time  guard  elapses. 
As  we  shall  see  below,  all  other  possibilities  can  be  derived  by  combining  kernel 
statements,  which  would  not  be  true  with  another  choice  for  watching. 

•  When started, a parallel statement “stat1 || stat2” immediately starts stat1 and stat2
in parallel. A parallel terminates instantly if and when both stat1 and stat2 are
terminated; they can terminate at different instants, the parallel waiting for the last
one to terminate. If, at some instant, one statement exits a trap T or both
statements exit the same trap T, then the parallel exits T. If both statements exit
distinct traps T1 and T2 at the same instant, then the parallel only exits the
outermost of these traps, the other one being discarded.

•  The  statement  “trap  T  in  stat  end”  defines  a  lexically  scoped  trap  T  within  stat. 
When  the  trap  statement  starts,  it  immediately  starts  its  body  stat  and  behaves  like  it 
until  termination  or  exit.  If  the  body  terminates,  so  does  the  trap  statement.  If  the 
body  exits  T,  then  the  trap  statement  terminates  instantaneously.  If  the  body  exits 
an  enclosing  trap  U,  so  does  the  trap  statement  (traps  propagate). 

•  An  “exit  T”  statement  instantaneously  exits  the  trap  T. 

•  When  started,  the  statement  “signal  S  in  stat  end”  immediately  starts  its  body  stat 
with  a  fresh  signal  S,  overriding  the  one  that  may  already  exist.  The  statement 
behaves  as  its  body  from  then  on. 

A  global  coherence  law  relates  emission  and  testing: 

A  signal  is  present  at  an  instant  if  and  only  if  it  is  received  as  input  by  the  environment 
or  emitted  by  the  program  itself  at  that  instant. 

Remarks.  Notice  that  an  emission  is  transient,  and  that  there  is  an  asymmetry 
between  present  and  absent  signals.  There  is  an  emit  statement  to  set  a  signal  present, 
but  no  statement  to  set  it  absent:  by  the  coherence  law,  this  is  just  the  default. 

Notice  also  that  a  loop  never  terminates  by  itself;  the  only  way  to  end  it  is  to  kill  it  by 
elapsing  an  enclosing  time  guard  or  by  explicitly  exiting  an  enclosing  trap  from  within 
the  loop  or  from  a  statement  placed  in  parallel  with  the  loop. 

Finally, notice that exiting a trap from one branch of a parallel instantaneously terminates the
corresponding trap and therefore kills the whole parallel. All parallel branches are
activated at the exit instant. For example, in “emit S || exit T”, the left branch emits S
and terminates, the right branch exits T, so that the parallel emits S and synchronizes
both branches by deciding to exit T. Therefore, being killed by an exit is less severe than
being killed by an enclosing watching time guard, which does not activate its body
when elapsed.

2.5  Examples 

The  only  statement  that  provokes  halting  is  halt.  To  take  a  finite  but  non-zero  amount 
of  time,  a  statement  must  involve  halt  statements  guarded  by  watching  statements. 
The  simplest  example  is  “do  halt  watching  S”  which  waits  for  S  and  terminates:  by 
itself,  the  body  halt  would  halt  forever,  but  the  enclosing  “watching  S”  guard  kills  it 
when S occurs, and it makes the whole statement terminate. Hence the statement is
guaranteed to “last exactly one S” from the time it is started (remembering that an S
present when the statement starts is not taken into account). Anticipating the
definition of derived statements, we write it as “await S”.

In  the  above  example,  S  can  be  any  signal,  a  second  as  well  as  a  centimetre,  a  clock 
tick,  or  generally  any  kind  of  interrupt.  Therefore,  each  signal  is  seen  as  defining  its  own 
time  unit.  Nesting  temporal  statements  bearing  on  different  time  units  is  the  main 
characteristic of the ESTEREL style (Berry et al 1988; Berry & Gonthier 1988). Here is a
program that repeatedly emits O every I until reception of a signal STOP:

do
   loop
      await I; emit O
   end
watching STOP

Here O is not emitted when STOP occurs, even if I is present, since the inner loop is
preempted by the external watching statement at that instant.

In  most  event  manipulation  languages,  the  basic  primitive  is  await,  that  waits  for  an 
event  to  start  a  computation  in  sequence.  On  the  contrary,  in  ESTEREL,  the  main 
primitive  is  watching,  that  waits  for  an  event  to  stop  or  preempt  a  computation.  It  is  a 
much  more  powerful  primitive  than  await.  In  particular,  it  is  easy  to  derive  await  from 
watching,  while  the  converse  is  definitely  not  true. 

Remember  the  boundary  problem  we  mentioned  when  describing  the  watching 
statement. To also emit O if I is present when STOP occurs, one uses a trap:

trap T in
   loop await I; emit O end
||
   await STOP; exit T
end

This  works  since  when  one  branch  of  a  parallel  exits  a  trap  that  encloses  the  parallel,  the 
other  branch  is  activated  in  the  corresponding  instant  before  being  killed.  It  can 
perform  its  “last  wills”. 
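The difference between the two versions can be made concrete with a small hand-written model. The sketch below is not an ESTEREL interpreter: the function name `emit_until_stop` and its `last_wills` flag are hypothetical, and the logic simply transcribes the informal behaviour described above (guards ignore the starting instant; the watching version suppresses O when STOP occurs, the trap version does not).

```python
def emit_until_stop(inputs, last_wills):
    """Hand-coded model of 'emit O every I until STOP'.

    inputs     -- list of sets of input signal names, one per instant;
                  inputs[0] is the starting instant.
    last_wills -- False models the 'watching STOP' version,
                  True models the 'trap T ... exit T' version.
    Returns the list of sets of emitted signals, one per instant.
    """
    outputs = []
    alive = True
    for n, event in enumerate(inputs):
        out = set()
        # At the starting instant both awaits merely start: signals
        # present at that instant are not taken into account.
        if alive and n > 0:
            stop = "STOP" in event
            # The loop branch runs at the STOP instant only in the trap version.
            if "I" in event and (last_wills or not stop):
                out.add("O")
            if stop:
                alive = False  # the program has terminated
        outputs.append(out)
    return outputs


trace = [{"I"}, {"I"}, set(), {"I", "STOP"}, {"I"}]
watching_version = emit_until_stop(trace, last_wills=False)
trap_version = emit_until_stop(trace, last_wills=True)
```

The two traces differ only at the instant where I and STOP are both present.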

The  other  boundary  problem  concerns  the  starting  instant.  If  one  wants  the  guard  to 
be  active  initially,  one  writes 

   present S else do stat watching S end

readily abbreviated into the derived statement

   do stat watching immediate S

The  following  toy  example  illustrates  the  preemption  mechanism  involved  in 
concurrent  exits: 

trap T1 in
   trap T2 in
      emit S1; exit T1
   ||
      exit T2; emit S2
   end;
   emit S3
end




G  Berry 


The first parallel branch emits S1 and exits T1. The second parallel branch exits T2 but
does not emit S2 since an exit statement does not terminate. The body of the parallel
exits simultaneously T1 and T2; since only the outermost trap matters, T2 is discarded
and T1 propagates. Hence S3 is not emitted, and the outermost trap terminates with
only S1 emitted.

2.6  Full  ESTEREL 

The  full  language  has  many  useful  derived  statements.  We  briefly  describe  the  most 
important  ones.  See  Berry  &  Gonthier  (1988)  for  the  complete  list  and  for  the  exact 
expansion  into  kernel  statements. 

Temporal  statements:  A  temporal  statement  is  characterized  by  the  fact  that  its 
expansion  involves  present,  watching,  or  halt  kernel  statements.  We  have  already  seen 
the  simple  await  statement  and  the  immediate  guard  variant.  Here  are  some  other 
useful  constructs: 

•  Boolean expressions on signals can appear in tests or guards, as in “present S1 and
S2” or “do stat watching not S”.

•  One  can  count  occurrences  of  a  signal  (or  boolean  expression)  within  a  time  guard, 
as  in  “await  3  S”.  Occurrence  counts  are  not  discussed  in  this  paper  but  are  easy  to 
handle. 

•  One  can  add  a  timeout  clause  to  be  executed  when  a  watching  statement 
terminates  by  elapsing  its  time  guard  and  not  when  the  body  terminates  by  itself: 

   do stat1 watching S timeout stat2 end

is just an abbreviation for:

   trap T in
      do stat1; exit T watching S;
      stat2
   end

•  The statement “do stat upto S” is just “do stat; halt watching S”. Even if the body
terminates, the upto statement waits for its guard to elapse.

•  Deterministic  event  selection  has  the  form: 

   await
      case S1 do stat1
      case S2 do stat2
   end

The statement waits simultaneously for S1 and S2. If one of them occurs alone, the
control is instantaneously transferred to the corresponding statement. If both
signals occur at the same time, the control is transferred to stat1 only. This guarantees
determinism.

•  There  are  two  temporal  loops: 

loop  stat  each  S 
every  S  do  stat  end 
The first loop starts stat at once, and kills and restarts it afresh whenever S
occurs. The second loop is similar but initially waits for S to start stat.

•  The  “sustain  S”  statement  emits  S  continuously.  It  abbreviates 

loop  emit  S  each  tick 

General  traps:  There  is  a  general  exception  handling  mechanism  that  extends  basic 
traps: 

trap T1, T2 in
   stat
handle T1 do stat1
handle T2 do stat2
end

When  a  trap  is  exited,  the  corresponding  handler  is  started  instantaneously.  Here  the 
traps  T1  and  T2  are  concurrent.  If  they  are  exited  simultaneously,  both  handlers  are  run 
in  parallel. 

Module  instantiation:  Modular  programming  is  achieved  by  the  run  statement,  which 
instantiates  a  module  in  place,  possibly  invoking  signal  renamings: 

run  M  [signal  S/I] 

A  run  statement  terminates  if  and  when  the  copied  module  body  does. 


3.  The  behavioral  semantics 


Several mathematical semantics have been developed for ESTEREL, including a
denotational semantics that precisely formalizes the intuitive temporal concepts
presented in §2.3; see Gonthier (1988). Here we prefer to use the behavioral semantics
(Berry & Gonthier 1988) that defines execution reaction by reaction, using Plotkin’s
Structural Operational Semantics technique (SOS for short). It is shown equivalent
to the denotational one in Gonthier (1988).


3.1  Form  of  the  rules 

The behavioral semantics defines transitions of the form $M \xrightarrow[I]{O} M'$, where M is a
module, I is an input event, O is the corresponding output event, and M' is a new
module that will correctly respond to the next input events. In other words, M' is
the new state of M after the reaction to I. The reaction $O_1, O_2, \ldots, O_n, \ldots$ to an input
sequence $I_1, I_2, \ldots, I_n, \ldots$ is then defined inductively by chaining elementary reactions:

$$M \xrightarrow[I_1]{O_1} M_1 \xrightarrow[I_2]{O_2} \cdots\; M_{n-1} \xrightarrow[I_n]{O_n} M_n \xrightarrow[I_{n+1}]{O_{n+1}} \cdots$$

A behavioral transition $M \xrightarrow[I]{O} M'$ is computed using an auxiliary relation
$\mathit{stat} \xrightarrow[E]{E',\,k} \mathit{stat}'$
defined by structural induction on statements. Here E is the current event in which stat
evolves, E' is the event made of the signals emitted by stat, and k is an integer
termination level that codes the way in which stat terminates or exits and is precisely
defined below.

The  current  event  E  is  made  of  all  the  signals  that  are  present  at  the  given  instant; 
because  of  the  coherence  law,  E  must  contain  the  set  E'  of  emitted  signals,  which  in 
turn  depends  on  E.  Hence  E  and  E'  will  be  computed  as  fixpoints,  the  fixpoint  equation 
being  located  in  the  local  signal  rule  below. 

Let stat be the body of M and stat' be the body of M'. The relation between both
transition systems is as follows:

$$M \xrightarrow[I]{O} M' \quad\text{iff}\quad \mathit{stat} \xrightarrow[I \,\cup\, O \,\cup\, \{\texttt{tick}\}]{O,\,k} \mathit{stat}' \ \text{ for some } k$$

(under the minor restriction that no input signal is internally emitted by stat, see
Berry & Gonthier 1988).

Termination  levels 

The termination level k is 0 if stat terminates in the current instant, 1 if stat halts in the
current instant, and j + 2 if stat exits a trap T that lies j trap levels above the innermost
enclosing trap, i.e. if the exit must be propagated through j intermediate traps before
reaching its own trap. To handle the exit
level, it is useful to first decorate the exit statements with the corresponding level, as in
the following example:

trap T in
   exit T^2
||
   trap U in
      exit T^3
   ||
      exit U^2
   end
end

Here  the  first  T  exit  and  the  U  exit  are  labeled  2  since  there  is  no  intermediate  trap 
statement  to  traverse,  while  the  second  T  exit  is  labeled  3  since  one  must  traverse  the 
trap  U  statement  to  reach  the  trap  T  statement.  This  way  of  handling  termination  is 
simpler  than  the  one  used  in  Berry  &  Gonthier  (1988),  but  equivalent  to  it  as  shown  in 
Gonthier  (1988)  (see  also  Cousineau  1980). 
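The decoration step lends itself to a short sketch. The tuple encoding of statements and the function name `decorate` below are assumptions made for illustration only; the level computed for each exit is 2 plus the number of trap statements that must be traversed before reaching the trap being exited, as in the example above.

```python
def decorate(stat, enclosing=()):
    """Attach an exit level to every exit statement.

    Statements are nested tuples (a hypothetical encoding):
      ("trap", name, body), ("exit", name), ("par", s1, s2), ("seq", s1, s2), ...
    `enclosing` lists the names of the enclosing traps, innermost first.
    """
    op = stat[0]
    if op == "exit":
        # index 0 is the innermost enclosing trap, which gets level 2;
        # lexical scoping means the first match is the right one
        depth = enclosing.index(stat[1])
        return ("exit", stat[1], depth + 2)
    if op == "trap":
        return ("trap", stat[1], decorate(stat[2], (stat[1],) + enclosing))
    # any other construct: decorate sub-statements, keep other fields as they are
    return (op,) + tuple(
        decorate(x, enclosing) if isinstance(x, tuple) else x for x in stat[1:]
    )


example = ("trap", "T",
           ("par",
            ("exit", "T"),
            ("trap", "U", ("par", ("exit", "T"), ("exit", "U")))))
decorated = decorate(example)
```

On the example of the text, the first T exit and the U exit receive level 2 and the second T exit receives level 3.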

3.2  Inductive  rules 

The  nothing  statement  terminates  instantaneously. 

$$\texttt{nothing} \xrightarrow[E]{\emptyset,\,0} \texttt{nothing}$$

The halt statement halts and rewrites into itself.

$$\texttt{halt} \xrightarrow[E]{\emptyset,\,1} \texttt{halt}$$

An emit statement emits its signal and terminates.

$$\texttt{emit } S \xrightarrow[E]{\{S\},\,0} \texttt{nothing}$$

If  the  first  statement  of  a  sequence  terminates,  the  second  statement  is  started  at  once; 
the  emitted  signals  are  merged  to  form  the  resulting  emitted  event,  according  to  perfect 
synchrony. 

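$$\frac{\mathit{stat}_1 \xrightarrow[E]{E'_1,\,0} \mathit{stat}'_1 \qquad \mathit{stat}_2 \xrightarrow[E]{E'_2,\,k_2} \mathit{stat}'_2}{\mathit{stat}_1;\ \mathit{stat}_2 \xrightarrow[E]{E'_1 \cup E'_2,\,k_2} \mathit{stat}'_2}$$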

If  the  first  statement  of  a  sequence  does  not  terminate,  that  is  if  it  halts  or  exits  a  trap,  the 
sequence  behaves  just  as  the  first  statement  and  the  second  statement  is  kept  unchanged 
for  further  reactions. 

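$$\frac{\mathit{stat}_1 \xrightarrow[E]{E'_1,\,k_1} \mathit{stat}'_1 \qquad k_1 > 0}{\mathit{stat}_1;\ \mathit{stat}_2 \xrightarrow[E]{E'_1,\,k_1} \mathit{stat}'_1;\ \mathit{stat}_2}$$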

A  loop  instantaneously  unfolds  itself  once.  Its  body  is  not  allowed  to  terminate 
instantaneously. 

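$$\frac{\mathit{stat} \xrightarrow[E]{E',\,k} \mathit{stat}' \qquad k > 0}{\texttt{loop } \mathit{stat} \texttt{ end} \xrightarrow[E]{E',\,k} \mathit{stat}';\ \texttt{loop } \mathit{stat} \texttt{ end}}$$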

A  present  statement  instantaneously  selects  its  then  branch  if  the  signal  tested  for  is 
present  in  the  current  instant.  Otherwise,  it  instantaneously  selects  its  else  branch. 

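$$\frac{S \in E \qquad \mathit{stat}_1 \xrightarrow[E]{E'_1,\,k_1} \mathit{stat}'_1}{\texttt{present } S \texttt{ then } \mathit{stat}_1 \texttt{ else } \mathit{stat}_2 \texttt{ end} \xrightarrow[E]{E'_1,\,k_1} \mathit{stat}'_1}$$

$$\frac{S \notin E \qquad \mathit{stat}_2 \xrightarrow[E]{E'_2,\,k_2} \mathit{stat}'_2}{\texttt{present } S \texttt{ then } \mathit{stat}_1 \texttt{ else } \mathit{stat}_2 \texttt{ end} \xrightarrow[E]{E'_2,\,k_2} \mathit{stat}'_2}$$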

A  watching  statement  transfers  the  control  to  its  body  and  rewrites  itself  into  a  present 
statement  in  order  to  set  the  time  guard  at  next  instant  if  the  body  has  halted. 

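$$\frac{\mathit{stat} \xrightarrow[E]{E',\,k} \mathit{stat}'}{\texttt{do } \mathit{stat} \texttt{ watching } S \xrightarrow[E]{E',\,k} \texttt{present } S \texttt{ else do } \mathit{stat}' \texttt{ watching } S \texttt{ end}}$$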

A  parallel  statement  starts  its  branches  instantaneously,  merges  the  emitted  signals, 
and returns the max of the termination codes. We leave it to the reader to see that this
max operation exactly performs the required synchronization in all termination cases.

$$\frac{\mathit{stat}_1 \xrightarrow[E]{E'_1,\,k_1} \mathit{stat}'_1 \qquad \mathit{stat}_2 \xrightarrow[E]{E'_2,\,k_2} \mathit{stat}'_2}{\mathit{stat}_1 \,\|\, \mathit{stat}_2 \xrightarrow[E]{E'_1 \cup E'_2,\ \max(k_1,k_2)} \mathit{stat}'_1 \,\|\, \mathit{stat}'_2}$$

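A trap terminates if its body terminates or exits the trap, that is, returns termination
code 2. If the body halts, so does the trap. If the body exits an enclosing trap, then the
exit is propagated by subtracting 1 from the exit level.

$$\frac{\mathit{stat} \xrightarrow[E]{E',\,k} \mathit{stat}' \qquad k = 0 \ \text{or}\ k = 2}{\texttt{trap } T \texttt{ in } \mathit{stat} \texttt{ end} \xrightarrow[E]{E',\,0} \texttt{nothing}}$$

$$\frac{\mathit{stat} \xrightarrow[E]{E',\,k} \mathit{stat}' \qquad (k = 1 \ \text{and}\ k' = 1) \ \text{or}\ (k > 2 \ \text{and}\ k' = k - 1)}{\texttt{trap } T \texttt{ in } \mathit{stat} \texttt{ end} \xrightarrow[E]{E',\,k'} \texttt{trap } T \texttt{ in } \mathit{stat}' \texttt{ end}}$$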

An exit statement returns its exit level.

$$\texttt{exit } T^{k} \xrightarrow[E]{\emptyset,\,k} \texttt{halt}$$

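Finally, the local signal declaration rules wind up the events E and E' according to the
coherence law given in §2.3. Within the body, they impose that a local signal is present
in E if and only if it is emitted in E'. A local signal is obviously not propagated outside its
declaration.

$$\frac{\mathit{stat} \xrightarrow[E \cup \{S\}]{E' \cup \{S\},\,k} \mathit{stat}' \qquad S \notin E'}{\texttt{signal } S \texttt{ in } \mathit{stat} \texttt{ end} \xrightarrow[E]{E',\,k} \texttt{signal } S \texttt{ in } \mathit{stat}' \texttt{ end}}$$

$$\frac{\mathit{stat} \xrightarrow[E - \{S\}]{E',\,k} \mathit{stat}' \qquad S \notin E'}{\texttt{signal } S \texttt{ in } \mathit{stat} \texttt{ end} \xrightarrow[E]{E',\,k} \texttt{signal } S \texttt{ in } \mathit{stat}' \texttt{ end}}$$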

Remarks. The resulting statement stat' is unused and therefore immaterial for any
rule returning k > 1; it is discarded by the exited trap. If a rule returns k = 0, then its
resulting term is equivalent to nothing.

Because of the intrinsic fixpoint character of the local signal rule, our inference
system does not yield a straightforward algorithm to compute a transition. Given any
input I one must guess the right current event E and use the rules to check that there is a
correct transition. Other semantics yield finer analysis and efficient algorithms to
compute the reaction; see in particular the computational semantics in Berry &
Gonthier (1988).
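Leaving the local signal rule aside, the inductive rules can be read directly as a recursive function. The following is a minimal sketch under stated assumptions: statements are encoded as nested tuples (an encoding chosen here, not taken from the paper), exits carry their decorated level, and the current event E is supplied by the caller, who remains responsible for the coherence law. The names `step`, `NOTHING` and `HALT` are hypothetical.

```python
NOTHING = ("nothing",)
HALT = ("halt",)


def step(stat, E):
    """One reaction of `stat` in current event E.

    Returns (emitted, k, residual) following the rules of the text:
    k = 0 terminated, k = 1 halted, k >= 2 exit level.
    Local signals are deliberately not handled (no fixpoint search).
    """
    op = stat[0]
    if op == "nothing":
        return frozenset(), 0, NOTHING
    if op == "halt":
        return frozenset(), 1, HALT
    if op == "emit":                       # ("emit", S)
        return frozenset({stat[1]}), 0, NOTHING
    if op == "exit":                       # ("exit", level)
        return frozenset(), stat[1], HALT
    if op == "seq":                        # ("seq", s1, s2)
        e1, k1, r1 = step(stat[1], E)
        if k1 == 0:                        # first rule: start s2 at once
            e2, k2, r2 = step(stat[2], E)
            return e1 | e2, k2, r2
        return e1, k1, ("seq", r1, stat[2])
    if op == "loop":                       # ("loop", body)
        e, k, r = step(stat[1], E)
        if k == 0:
            raise ValueError("loop body terminates instantaneously")
        return e, k, ("seq", r, stat)
    if op == "present":                    # ("present", S, then, else)
        return step(stat[2] if stat[1] in E else stat[3], E)
    if op == "watching":                   # ("watching", body, S)
        e, k, r = step(stat[1], E)
        return e, k, ("present", stat[2], NOTHING, ("watching", r, stat[2]))
    if op == "par":                        # ("par", s1, s2)
        e1, k1, r1 = step(stat[1], E)
        e2, k2, r2 = step(stat[2], E)
        return e1 | e2, max(k1, k2), ("par", r1, r2)
    if op == "trap":                       # ("trap", body)
        e, k, r = step(stat[1], E)
        if k in (0, 2):
            return e, 0, NOTHING
        return e, (1 if k == 1 else k - 1), ("trap", r)
    raise ValueError(f"unknown statement {op!r}")


# The concurrent-exit example of section 2.5, with decorated exits:
inner = ("par",
         ("seq", ("emit", "S1"), ("exit", 3)),   # exit T1 crosses trap T2
         ("seq", ("exit", 2), ("emit", "S2")))   # exit T2
prog = ("trap", ("seq", ("trap", inner), ("emit", "S3")))
emitted, k, residual = step(prog, frozenset({"S1"}))
```

On this example only S1 is emitted and the outer trap terminates, in agreement with the informal explanation of section 2.5. The sketch does not check that E actually contains the emitted signals.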
3.3  Correct  programs

Not all ESTEREL programs make sense. We say that a module M is locally correct if
there is only one provable transition $M \xrightarrow[I]{O} M'$ for any input event I. We say that M
is correct if it is locally correct and if all modules obtained by all possible sequences of
provable transitions are locally correct.

Correctness of ESTEREL programs is a difficult issue. It is similar to correctness of
digital circuits (absence of races), although much more complex because of the power of
the ESTEREL instantaneous loop construct. The ESTEREL compiler checks for
reasonably general sufficient correctness conditions, see Berry & Gonthier (1988). Here,
we just show two examples of (locally) incorrect programs.

The following program has no fixpoint, since S should not be emitted if present and
emitted if not present. It is analogous to $X = \neg X$ in circuits.

signal S in
   present S else emit S end
end

The next program has two fixpoints, one of S1 or S2 being present in each. It is
similar to $X_1 = \neg X_2$, $X_2 = \neg X_1$ in circuits.

signal S1, S2 in
   present S1 else emit S2 end
||
   present S2 else emit S1 end
end
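Both claims can be checked by brute force: enumerate every candidate set of present local signals and keep those that coincide with the set of signals the body emits. This is only an illustrative sketch; the helper `fixpoints` and the two functions modelling the program bodies are hypothetical names written by hand for these two programs.

```python
from itertools import combinations


def fixpoints(signals, emitted):
    """All sets E of local signals with emitted(E) == E (coherence law)."""
    solutions = []
    for size in range(len(signals) + 1):
        for combo in combinations(signals, size):
            E = frozenset(combo)
            if emitted(E) == E:
                solutions.append(E)
    return solutions


def body1(E):
    # present S else emit S end
    return frozenset() if "S" in E else frozenset({"S"})


def body2(E):
    # present S1 else emit S2 end || present S2 else emit S1 end
    out = set()
    if "S1" not in E:
        out.add("S2")
    if "S2" not in E:
        out.add("S1")
    return frozenset(out)


no_solution = fixpoints(["S"], body1)
two_solutions = fixpoints(["S1", "S2"], body2)
```

The first search finds no consistent event and the second finds two, so neither program has exactly one provable transition.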


4.  The  haltset  coding  of  states 

We  now  present  an  essential  concept  of  the  theory  of  ESTEREL,  the  unambiguous 
coding  of  any  state  by  a  set  of  control  points  in  the  original  program.  Technically, 
control  points  are  represented  by  halt  positions  in  the  kernel  expansion  of  the  module 
body  (notice  that  the  expansion  of  any  derived  temporal  statement  generates  at  most 
one  halt).  Since  ESTEREL  is  concurrent,  a  state  is  given  by  a  set  of  control  positions, 
which  we  call  a  haltset.  The  haltset  coding  is  important  in  two  respects.  First,  its 
existence  shows  the  rationality  of  ESTEREL:  only  finitely  many  statements  can  be 
generated  by  the  rewritings  of  a  given  statement.  Second,  it  is  the  direct  basis  of  the 
hardware  implementation,  and  it  is  also  heavily  used  in  the  software  implementation. 

The  reader  might  skip  this  section  at  first  reading  and  proceed  directly  with  the 
informal  presentation  of  the  hardware  translation  in  §  5.  However,  an  understanding  of 
the  material  presented  here  will  be  necessary  to  see  why  the  translation  is  done  that  way 
and  why  it  indeed  works. 

In  the  sequel,  we  consider  a  fixed  correct  module  M  of  expanded  body  stat.  For 
technical  reasons,  we  assume  that  the  body  of  M  never  terminates,  adding  a  trailing  halt 
if  necessary.  This  condition  does  not  change  the  observable  behaviors;  of  course, 
adding  a  trailing  halt  is  done  after  expansion  and  not  in  modules  copied  by  M. 

Call  a  derivative  of  stat  any  statement  stat'  that  can  be  reached  from  stat  by  some 
sequence  of  reaction  provable  in  the  behavioral  semantics.  So  far,  the  derivatives  are 
defined by a rewriting process and bear no obvious structural relation with the source
term stat. We show that any derivative can be unambiguously coded by a haltset H of
stat, that is by a set of occurrences of halt statements in the kernel statement stat.

Consider for example the derivatives of "await S1; await S2; halt". There are three
halt statements, the first two being respectively generated by the first and the
second await. Number them 0, 1, 2. The whole statement itself will be coded by the
empty haltset ∅. The derivative that waits for S1 is

present S1 else
   await S1
end;
await S2;
halt

Its haltset will be {0}, the index of the halt generated by the active "await S1"
statement. The derivative that waits for S2 is

present S2 else
   await S2
end;
halt

Its haltset will be {1} since the second await is active. The final derivative is halt, coded
by {2}. Non-singleton haltsets will be constructed by the parallel operator, which will
return the union of the haltsets of its branches.

4.1  Haltsets 

We number all occurrences of halt in stat by distinct integers from 0 to n, n ≥ 0. Then a
haltset H is a subset of [0..n] that satisfies the following separation condition: If stat1
and stat2 are the two statements of a sequence or the two branches of a present test,
then H cannot contain an occurrence of halt in stat1 together with an occurrence of halt
in stat2.

We decorate the behavioral semantics rules by returning a haltset H when executing
a numbered term. This haltset will record the places where the term has halted. The
rules take the new form $\mathit{stat} \xrightarrow[E]{E',\,k,\,H} \mathit{stat}'$. We always return H = ∅ when k ≠ 1 and
H ≠ ∅ when k = 1. Adding haltsets is easy for all rules except the parallel one. Executed
halt statements are put into the haltset by the rule of halt and propagated by the other
rules. Since the transformation is fairly obvious, we just list a few rules and leave the
other ones to the reader.


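$$\texttt{nothing} \xrightarrow[E]{\emptyset,\,0,\,\emptyset} \texttt{nothing}$$

$$\texttt{halt}_i \xrightarrow[E]{\emptyset,\,1,\,\{i\}} \texttt{halt}_i$$

$$\frac{\mathit{stat}_1 \xrightarrow[E]{E'_1,\,0,\,\emptyset} \mathit{stat}'_1 \qquad \mathit{stat}_2 \xrightarrow[E]{E'_2,\,k_2,\,H_2} \mathit{stat}'_2}{\mathit{stat}_1;\ \mathit{stat}_2 \xrightarrow[E]{E'_1 \cup E'_2,\,k_2,\,H_2} \mathit{stat}'_2}$$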


A  hardware  implementation  of  pure  ESTEREL 




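$$\frac{\mathit{stat}_1 \xrightarrow[E]{E'_1,\,k_1,\,H_1} \mathit{stat}'_1 \qquad k_1 > 0}{\mathit{stat}_1;\ \mathit{stat}_2 \xrightarrow[E]{E'_1,\,k_1,\,H_1} \mathit{stat}'_1;\ \mathit{stat}_2}$$

$$\frac{\mathit{stat} \xrightarrow[E]{E',\,k,\,\emptyset} \mathit{stat}' \qquad k = 0 \ \text{or}\ k = 2}{\texttt{trap } T \texttt{ in } \mathit{stat} \texttt{ end} \xrightarrow[E]{E',\,0,\,\emptyset} \texttt{nothing}}$$

$$\frac{\mathit{stat} \xrightarrow[E]{E',\,k,\,H} \mathit{stat}' \qquad (k = 1 \ \text{and}\ k' = 1) \ \text{or}\ (k > 2 \ \text{and}\ k' = k - 1)}{\texttt{trap } T \texttt{ in } \mathit{stat} \texttt{ end} \xrightarrow[E]{E',\,k',\,H} \texttt{trap } T \texttt{ in } \mathit{stat}' \texttt{ end}}$$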

For  a  parallel,  we  return  the  union  of  the  haltsets  returned  by  the  branches  unless 
one  of  the  branches  exits  a  trap,  in  which  case  we  return  an  empty  haltset.  We  make 
an  additional  technical  modification  explained  later  on:  when  one  branch  terminates, 
we  rewrite  it  into  nothing. 

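$$\frac{\begin{gathered}
\mathit{stat}_1 \xrightarrow[E]{E'_1,\,k_1,\,H_1} \mathit{stat}'_1 \qquad \mathit{stat}_2 \xrightarrow[E]{E'_2,\,k_2,\,H_2} \mathit{stat}'_2 \\
H = \begin{cases} H_1 \cup H_2 & \text{if } \max(k_1,k_2) \le 1 \\ \emptyset & \text{if } \max(k_1,k_2) > 1 \end{cases}
\qquad
\mathit{stat}''_i = \begin{cases} \mathit{stat}'_i & \text{if } k_i \ne 0 \\ \texttt{nothing} & \text{if } k_i = 0 \end{cases}
\end{gathered}}{\mathit{stat}_1 \,\|\, \mathit{stat}_2 \xrightarrow[E]{E'_1 \cup E'_2,\ \max(k_1,k_2),\ H} \mathit{stat}''_1 \,\|\, \mathit{stat}''_2}$$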

Since a module body is supposed to always halt, its global termination code must
be  1.  Hence  the  rule  always  returns  a  well-defined  haltset  H  for  any  immediate 
derivative.  This  haltset  is  easily  seen  to  satisfy  the  separation  condition. 
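The side conditions of the decorated parallel rule amount to a small combination function. The sketch below assumes each branch result is a tuple (emitted, k, haltset, residual); the function name `join_parallel` and the tuple layout are illustrative choices, not notation from the paper.

```python
def join_parallel(res1, res2):
    """Combine two branch results as in the decorated parallel rule.

    Each result is (emitted, k, haltset, residual). The haltsets are merged
    unless some branch exits a trap (max level > 1), and a terminated
    branch (k == 0) is rewritten into nothing.
    """
    (e1, k1, h1, r1), (e2, k2, h2, r2) = res1, res2
    k = max(k1, k2)
    H = (h1 | h2) if k <= 1 else frozenset()
    r1 = ("nothing",) if k1 == 0 else r1
    r2 = ("nothing",) if k2 == 0 else r2
    return e1 | e2, k, H, ("par", r1, r2)


# One branch terminates while emitting S, the other halts on halt number 3.
terminated = (frozenset({"S"}), 0, frozenset(), ("some-residual",))
halted = (frozenset(), 1, frozenset({3}), ("halt", 3))
combined = join_parallel(terminated, halted)
```

Here the parallel halts with haltset {3}, and the terminated branch shows up as nothing in the residual term.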

4.2  Recovering  derivatives  from  haltsets 

We now recover the derivative stat' from stat and H. We proceed in two steps. First we
define a labeled term $\mathit{stat}^{H}$ obtained by labeling the subterms of stat by either H+ or
H-; a subterm is labeled H+ if and only if it contains at least one occurrence of halt
whose number is in H. If we care about the label of $\mathit{stat}^{H}$ itself, then we write it explicitly,
as in $\mathit{stat}^{H+}$. The labels are of course redundant with H, but they make the definitions
and proofs much simpler to write.

Then we define a term $R(\mathit{stat}^{H})$ by structural induction on $\mathit{stat}^{H}$. Subterms labeled
by H- and halt statements are left unchanged.

$$R(\mathit{stat}^{H-}) = \mathit{stat} \qquad\qquad R(\texttt{halt}^{H+}) = \texttt{halt}$$




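The trap and local signal declaration constructs are handled by trivial structural induction.

$$\begin{aligned}
R(\texttt{trap } T \texttt{ in } \mathit{stat}^{H} \texttt{ end}) &= \texttt{trap } T \texttt{ in } R(\mathit{stat}^{H}) \texttt{ end} \\
R(\texttt{signal } S \texttt{ in } \mathit{stat}^{H} \texttt{ end}) &= \texttt{signal } S \texttt{ in } R(\mathit{stat}^{H}) \texttt{ end}
\end{aligned}$$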


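The only non-trivial cases are:

$$\begin{aligned}
R(\mathit{stat}_1^{H+};\ \mathit{stat}_2^{H-}) &= R(\mathit{stat}_1^{H+});\ \mathit{stat}_2 \\
R(\mathit{stat}_1^{H-};\ \mathit{stat}_2^{H+}) &= R(\mathit{stat}_2^{H+}) \\
R(\texttt{loop } \mathit{stat}^{H+} \texttt{ end}) &= R(\mathit{stat}^{H+});\ \texttt{loop } \mathit{stat} \texttt{ end} \\
R(\texttt{present } S \texttt{ then } \mathit{stat}_1^{H+} \texttt{ else } \mathit{stat}_2^{H-} \texttt{ end}) &= R(\mathit{stat}_1^{H+}) \\
R(\texttt{present } S \texttt{ then } \mathit{stat}_1^{H-} \texttt{ else } \mathit{stat}_2^{H+} \texttt{ end}) &= R(\mathit{stat}_2^{H+}) \\
R(\texttt{do } \mathit{stat}^{H+} \texttt{ watching } S) &= \texttt{present } S \texttt{ else do } R(\mathit{stat}^{H+}) \texttt{ watching } S \texttt{ end} \\
R(\mathit{stat}_1^{H+} \,\|\, \mathit{stat}_2^{H+}) &= R(\mathit{stat}_1^{H+}) \,\|\, R(\mathit{stat}_2^{H+}) \\
R(\mathit{stat}_1^{H+} \,\|\, \mathit{stat}_2^{H-}) &= R(\mathit{stat}_1^{H+}) \,\|\, \texttt{nothing} \\
R(\mathit{stat}_1^{H-} \,\|\, \mathit{stat}_2^{H+}) &= \texttt{nothing} \,\|\, R(\mathit{stat}_2^{H+})
\end{aligned}$$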


Notice  that  these  definitions  make  sense  only  when  the  separation  condition  is  satisfied. 
Notice  also  why  we  return  nothing  in  the  semantics  rules  when  a  branch  terminates: 
this  simplifies  the  definition  of  R. 

Since they exactly reproduce the (new) behavioral rules right-hand side terms, one
easily shows $R(\mathit{stat}^{H}) = \mathit{stat}'$ as expected.
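The definition of R is directly executable. The sketch below assumes a tuple encoding of numbered kernel terms in which each halt carries its number, as in ("halt", 0); the encoding and the helper names `halts` and `R` are assumptions made for illustration. The label H+ of a subterm is recomputed on demand as "contains a halt whose number is in H".

```python
def halts(t):
    """Numbers of the halt occurrences contained in term t."""
    if t[0] == "halt":
        return {t[1]}
    return set().union(*[halts(x) for x in t[1:] if isinstance(x, tuple)])


def R(t, H):
    """Recover the derivative coded by haltset H (separation condition assumed)."""
    if t[0] == "halt" or not (halts(t) & H):
        return t                               # label H-, or a halt: unchanged
    op = t[0]
    if op == "seq":                            # ("seq", s1, s2)
        if halts(t[1]) & H:
            return ("seq", R(t[1], H), t[2])
        return R(t[2], H)
    if op == "loop":                           # ("loop", body)
        return ("seq", R(t[1], H), t)
    if op == "present":                        # ("present", S, then, else)
        return R(t[2], H) if halts(t[2]) & H else R(t[3], H)
    if op == "watching":                       # ("watching", body, S)
        return ("present", t[2], ("nothing",), ("watching", R(t[1], H), t[2]))
    if op == "par":                            # ("par", s1, s2)
        left = R(t[1], H) if halts(t[1]) & H else ("nothing",)
        right = R(t[2], H) if halts(t[2]) & H else ("nothing",)
        return ("par", left, right)
    if op == "trap":                           # ("trap", body)
        return ("trap", R(t[1], H))
    if op == "signal":                         # ("signal", S, body)
        return ("signal", t[1], R(t[2], H))
    raise ValueError(f"unknown statement {op!r}")


# "await S1; await S2; halt", with await S expanded to "do halt watching S"
await1 = ("watching", ("halt", 0), "S1")
await2 = ("watching", ("halt", 1), "S2")
prog = ("seq", await1, ("seq", await2, ("halt", 2)))
waiting_for_s1 = R(prog, {0})
waiting_for_s2 = R(prog, {1})
```

With this encoding, the haltsets {0}, {1} and {2} yield the three derivatives listed at the beginning of section 4, and the empty haltset yields the program itself.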

We  now  give  the  main  result:  the  coding  extends  from  immediate  derivatives  to 
general  ones.  This  is  not  completely  obvious  since  the  R  operator  can  duplicate  halts  in 
the  loop  case.  The  result  is  as  follows: 


Theorem. Let stat be the body of a correct program. Let H be a haltset in stat. Then for
any behavioral rewriting of the form

$$R(\mathit{stat}^{H}) \xrightarrow[E]{E',\,1,\,H'} \mathit{stat}'$$

the haltset H' contains only halts occurring in stat' and one has $\mathit{stat}' = R(\mathit{stat}^{H'})$.


Proof. The proof is by structural induction on stat and by case inspection on the rule
applied to the whole term $R(\mathit{stat}^{H})$ to yield stat'. All cases being similar, we treat the
sequence and the loop as examples. We consider a given current event E.

Let first stat = stat1; stat2. There are three cases according to the labeling generated
by H.

•  If stat2 is labeled by H+, then $R(\mathit{stat}^{H}) = R(\mathit{stat}_2^{H})$. By correctness and by the
hypothesis that stat halts, $R(\mathit{stat}_2^{H})$ has a unique rewriting $R(\mathit{stat}_2^{H}) \xrightarrow[E]{E',\,1,\,H'} \mathit{stat}'$,
where H' is a nonempty haltset that only contains halts in stat2. By induction, one
has $\mathit{stat}' = R(\mathit{stat}_2^{H'+})$. Since H' is all in stat2 and nonempty, one has $R(\mathit{stat}_2^{H'+}) = R(\mathit{stat}^{H'})$
by definition of R and the result follows.

•  The two other cases can be grouped into one. They correspond to a term of the form
$R(\mathit{stat}_1^{H});\ \mathit{stat}_2$, taking H as given if the term is $R(\mathit{stat}_1^{H+});\ \mathit{stat}_2$ and H = ∅ if stat
itself has label H-. By correctness, this term has a unique behaviour, computed by
either the first or the second sequence rule. If the first sequence rule is used, then
stat' is generated entirely by stat2 and the result follows as in the first case. If the
second sequence rule is used, the termination code of $R(\mathit{stat}_1^{H})$ is 1 since stat halts.
By induction and by the form of the rule, one has $\mathit{stat}' = R(\mathit{stat}_1^{H'+});\ \mathit{stat}_2 = R(\mathit{stat}^{H'})$
for some nonempty H' having all its halts in stat1. The result follows.

Assume now stat = loop stat1 end. There are two subcases. If stat is labeled by H-,
then $R(\mathit{stat}^{H-})$ = loop stat1 end. The only applicable rule is the loop rule. It asks for
computing stat1, which must halt since stat does. By induction and by the loop rule, one
has $\mathit{stat} \xrightarrow[E]{E',\,1,\,H'} R(\mathit{stat}_1^{H'+});\ \mathit{stat}$ for some H'. The last term is just $R(\mathit{stat}^{H'})$ as expected.
If stat is labeled by H+, then $R(\mathit{stat}^{H+}) = R(\mathit{stat}_1^{H+});\ \mathit{stat}$. If the first term does not
terminate, we proceed as in the first loop case. Otherwise, the loop must be unfolded
once and we are back again in the first loop case.

COROLLARY.

Let stat be a module body. Then any derivative stat' of stat is equal to $R(\mathit{stat}^{H})$ for some
haltset H, and there are only finitely many derivatives.

Proof. By induction on the length of a rewriting sequence $\mathit{stat} \rightarrow^{*} \mathit{stat}'$, since stat itself
is equal to $R(\mathit{stat}^{\emptyset})$ and since stat always returns k = 1. The finiteness property is
obvious since there are only finitely many possible haltsets.
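The finiteness argument can be illustrated by enumerating the haltsets that satisfy the separation condition of section 4.1, which bounds the number of derivatives. The sketch below is an illustration under the same assumed tuple encoding of numbered terms, with halts written ("halt", i); the function name `haltsets` is hypothetical.

```python
def haltsets(t):
    """All subsets of the halts of t that satisfy the separation condition."""
    op = t[0]
    if op == "halt":                       # ("halt", i)
        return {frozenset(), frozenset({t[1]})}
    subs = [haltsets(x) for x in t[1:] if isinstance(x, tuple)]
    if not subs:                           # nothing, emit, exit: no halt inside
        return {frozenset()}
    if op == "par":
        # branches are independent: any pair of branch haltsets may be merged
        a, b = subs
        return {x | y for x in a for y in b}
    # sequence or present: halts of the two parts are never mixed;
    # loop, watching, trap, signal: same haltsets as the body
    return set().union(*subs)


await1 = ("watching", ("halt", 0), "S1")
await2 = ("watching", ("halt", 1), "S2")
prog = ("seq", await1, ("seq", await2, ("halt", 2)))
candidates = haltsets(prog)
```

For "await S1; await S2; halt" this yields the four haltsets ∅, {0}, {1} and {2} used in the example at the start of this section, whereas a parallel of two halts admits all four subsets of its two halts.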


5.  Principle  of  the  hardware  implementation 

In  this  section,  we  show  by  examples  how  to  translate  a  PURE  ESTEREL  program 
into  a  digital  circuit  that  computes  the  reaction  of  the  program  to  any  input  in  one 
clock  cycle.  The  translation  is  structural:  the  circuit  logical  geometry  is  the  same  as 
that  of  the  original  program.  The  translation  is  directly  based  on  the  haltset  coding 
theory  of  §4,  but  we  present  it  in  such  a  way  that  it  can  be  understood  independently 
of  this  coding. 

We  start  with  a  first  example  involving  only  halt  and  watching  statements.  Then  we 
show  how  to  handle  concurrency  and  exceptions.  Finally,  we  indicate  how  to  efficiently 
translate  the  full  language.  The  formal  translation  is  given  in  §  6. 

5.1  A  first  example 

Consider  the  following  program: 

module M:
input I, R;
output O;
loop
   loop
      await I; await I; emit O
   end
each R.

After an initialization instant in which I is ignored, the behavior is to emit O every two I,
restarting this behavior afresh each R. Expanded into kernel statements, the body
becomes:

loop
   do
      loop
         do
            halt
         watching I;
         do
            halt
         watching I;
         emit O
      end
   watching R
end

The  corresponding  circuit  is  drawn  in  figure  1.  It  has  two  input  pins  for  I  and  R  and  one 
output  pin  for  0.  There  are  four  kinds  of  cells,  called  Boot,  Watch,  Present,  and  Halt. 
Cell  output  pins  are  primed. 

The Boot and Halt cells each contain one register, assumed to initially contain value
0 and to be clocked by the global circuit’s clock. The other cells are purely
combinatorial. The Present cells are used for present and watching source statements,
each source "watching S" statement being conceptually rewritten into "watch
present S". This slight syntactic modification simplifies the cells and makes it easy to
implement boolean expressions.

The  circuit  contains  three  sorts  of  wires:  the  selection  wires  s0-s5,  the  activation 
wires  a0-a5,  and  the  control  wires  c0-c8.  The  unconnected  i  and  c'  1  pins  of  Halt  cells 
correspond  to  other  wires  unused  here  and  described  later  on.  Whenever  two  wires  go 
to  the  same  place,  they  are  implicitly  assumed  to  be  combined  by  an  or  gate. 

The  selection  and  activation  wires  go  in  reverse  directions  and  form  a  tree  that  is 
called  the  skeleton  of  the  circuit.  This  tree  is  determined  by  the  nesting  of  halt, 
watching,  and  ||  statements  in  the  source  program,  following  the  abstract  syntax 
revealed  by  the  source  code  indentation.  The  leftmost  Halt  and  Watch  cells  correspond 
to  the  first  await,  the  rightmost  ones  to  the  second  await. 

The selection wires are used to determine which part of the circuit can be active in a
given state: in our example, both await statements are in mutual exclusion, and only
one of them can be active at a time. When the first await is active, the wires s2, s1, and
s0 are set to 1. When the second await is active, the wires s4, s3, and s0 are set to 1. The
sources of the selection wires are the Halt cell registers. The upper selection wire s0 is
unconnected here, but we left it there to emphasize the structural character of the
translation.

The activation and control wires bear the flow of control. The activation wires
handle preemption between watching statements. In our example, the outermost
watching preempts the innermost one; by the semantics of ESTEREL, if R is present, the
outermost watching terminates without letting its body execute. The upper activation
wire a0 is always set.

Figure 1. First example.

The cells are defined as follows:

   Boot      n  := 1
             b   = ¬n

   Watch

   Present   c't = c * S
             c'f = c * ¬S

   Halt      s' := c + (a * s)

The notation is that of PALASM: '+' is or, '*' is and, '¬' is not, an equality is valid at
all times, and a register is denoted by ':='. Registers are supposed to initially contain 0.
In the sequel, we say that a wire is high or set if it has value 1 and low or reset if it has
value 0. We say that a register is set if it gets value 1 and reset if it gets value 0. Signals are
assumed to be present when their wire is set and absent when their wire is reset.
The output signal b of the Boot cell is high at the first clock tick and then remains low.
For a Halt cell, the value of the output signal s' is initially low and then that of c + (a * s)
delayed by one clock cycle. Hence a register is set either if an incoming control wire is set or
if the activation wire is set and the register was already set³. The definition of Halt is
only temporary: further pins will be added in §5.2.
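The register behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the translation itself: the class name Register and the helper names boot_output and halt_next are assumptions introduced here, and the input sequence is arbitrary.

```python
class Register:
    """A clocked register: initially 0, then the value latched at the previous tick."""

    def __init__(self):
        self.value = 0

    def clock(self, next_value):
        self.value = int(bool(next_value))


def boot_output(n):
    # Boot cell: b = not n, with n := 1
    return 1 - n.value


def halt_next(c, a, s):
    # Temporary Halt cell of section 5.1: s' := c + (a * s)
    return c | (a & s)


n, h = Register(), Register()
trace = []
# Each pair is (incoming control c, activation a) for one clock tick.
for c, a in [(1, 0), (0, 1), (0, 1), (0, 0), (0, 1)]:
    trace.append((boot_output(n), h.value))
    h.clock(halt_next(c, a, h.value))
    n.clock(1)
```

In this run the boot wire is high only at the first tick, the Halt register is set by the control wire, kept while the activation wire stays high, and lost as soon as activation drops; activation alone cannot set it again.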

5.1a  A  sample  execution:  At  boot  time,  the  Halt  cell  registers  contain  0  and  the 
selection  wires  are  all  low;  the  boot  control  wire  b  is  high.  Because  of  the  cell  equations, 
all  other  wires  are  low.  Hence  the  only  effect  is  to  set  the  leftmost  Halt  register. 

On the next clock tick, assume that I is present and R is absent. Then s2, s1, and s0 are
set by the Halt register. Since a0 is always set, the control flows down by setting c0, which
triggers the test for R in the upper Present cell. Since R is low, the control flows through
the c'f pin and sets c2, which is connected to the c input pin of the Watch cell. This pin
is directly connected to the a' output pin, and the control flows through a1 and a4
(which are connected with each other and in fact form a single equipotential). Since
both s2 and a1 are high, the leftmost Watch cell sets c3 and the leftmost Present cell
sets c4 since I is present. This sets the rightmost Halt register. Since s4 is low, the
rightmost Watch cell is inactive. Having no incoming control set, the leftmost Halt
register is reset. This terminates the first "await I" statement.

On the next clock tick, if I is present, the execution is symmetrical: the rightmost Halt is
reset and the leftmost one is set. The wires set are s3, s4, a0, c0, c2, a1 = a4, c6, and c7.
Since c7 is also connected to the output O, this output is set. If instead R is present, the
wires set are s3, s4, a0, c0, c1, and one is back to the state just after boot. If neither I nor
R is present, then the wires set are s3, s4, a0, c0, c2, a1 = a4, c6, c8, and a3, and the
state is simply restored as expected.
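The sample execution can be replayed with a small state-level simulation. The sketch below is an assumption-laden abstraction: it does not reproduce the wire numbering of figure 1, and the names step, run, term1, stay1, and so on are hypothetical. It only keeps the Boot register and the two Halt registers and applies the cell equations given above.

```python
def step(state, I, R):
    """One clock tick. state = (n, h1, h2): Boot register and the two Halt registers."""
    n, h1, h2 = state
    b = 1 - n                  # boot wire, high only at the first tick
    s0 = h1 | h2               # top selection wire (or of the Halt registers)
    a0 = 1                     # top activation wire, always set
    restart = a0 & s0 & R      # outer "watching R" elapses
    a1 = a0 & s0 & (1 - R)     # otherwise activation is passed to the body
    term1 = a1 & h1 & I        # first "await I" terminates
    stay1 = a1 & h1 & (1 - I)  # first "await I" keeps waiting
    term2 = a1 & h2 & I        # second "await I" terminates
    stay2 = a1 & h2 & (1 - I)  # second "await I" keeps waiting
    O = term2                  # "emit O" follows the second await
    start = b | restart        # control entering the inner loop body
    next_h1 = start | term2 | stay1
    next_h2 = term1 | stay2
    return (1, next_h1, next_h2), O


def run(inputs):
    """Feed a list of (I, R) pairs, one per tick, and collect the values of O."""
    state, outputs = (0, 0, 0), []
    for I, R in inputs:
        state, O = step(state, I, R)
        outputs.append(O)
    return outputs


outputs = run([(1, 0), (1, 0), (1, 0), (0, 0), (1, 0), (1, 0), (1, 1), (1, 0), (1, 0)])
```

The first I is ignored at boot, O is then emitted on every second I, and the occurrence of R at the seventh tick restarts the count, as described in the text.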

5.1b Relation with the haltset coding: Intuitively, the relation between our circuit and
the haltset coding of derivatives is as follows:

•  A  state  of  the  circuit  is  a  set  of  Halt  cells  set  to  1.  It  is  therefore  exactly  a  haltset. 

•  The  selection  wires  just  compute  the  +  and  —  labels  of  statements,  +  being 
represented  by  a  1  in  the  selection  wire. 

•  Sending the control to the translation of a subterm stat1 by setting an incoming
control wire amounts to executing stat1. For example, setting b executes the whole
statement, setting b or c1 executes the first await I, and setting c4 executes the
second await I.

•  Sending the control to the translation of a subterm stat1 by setting its incoming
activation wire amounts to executing R(stat1^H) if stat1 is labeled by + in H, i.e. if the
corresponding selection wire is set.

Hence, in a haltset H and an input I, the circuit just mimics the behavioral proof of
R(stat^H) in I. This point will be made very precise in §7.

Notice that the Boot cell is not really necessary since the initial state can also be
recognized as the only state where all Halt cells have value 0, that is where the wire s0 is
low. We could as well connect the b wire to the negation of s0. However, it is convenient
in practice to add the auxiliary Boot cell to reduce the length of wires and the number of
logical levels.


3 The  multiplication  by  s  is  there  to  prevent  setting  the  second  Halt  register  in  a  term  such  as 
“do  halt;  halt  watching  S”  when  a  is  set. 
5.2  Translating parallel and exceptions

The  most  complex  operator  is  of  course  the  parallel  one,  since  it  must  synchronize  the 
termination  of  its  branches  and  propagate  exceptions.  Consider  the  following  program 
fragment: 

trap T in
   await S1
||
   present S2 then exit T end
end

The corresponding circuit fragment is shown in figure 2. The leftmost Watch-Present-Halt
cell group is generated by "await S1". The rightmost Present cell is generated by
"present S2" (where "else nothing" was omitted as usual). The branches are simply
put in parallel and synchronized by the Parallel cell. The circuit fragment starts when it
receives control by setting the c0 wire.

The  Parallel  cell  has  two  parts:  the  fork  part,  which  involves  the  six  leftmost  pins,  and 
the  synchronization  part,  which  involves  the  eight  rightmost  ones. 

The  fork  part  is  simple:  selection  wires  are  gathered  by  an  or  gate  and  activation 
and  control  are  dispatched  to  branches. 

The synchronization part is more subtle. The pins c0, c1, and c2 record the different
termination modes according to their codes: c0 means termination,
c1 means halt, and c2 means exiting T. With each termination pin ci is associated a
continuation pin c'i. (In fact, c'1 is not really a continuation in the usual sense: it is
recursively linked to the c1 entry of the enclosing Parallel cell when such a cell exists.)

Figure 2. Second example.

As  explained  in  §  3,  the  synchronization  realized  by  the  parallel  amounts  to  compute 
the  max  of  the  termination  codes  of  its  branches  and  to  only  activate  the  corresponding 
continuation.  It  therefore  uses  a  priority  queue. 

In our example, the left branch can halt, as signaled by setting wire c5, or terminate,
as signaled by setting wire c3. The right branch can terminate or exit T, as
respectively signaled by setting wires c7 and c6. Since exiting T and terminating the
parallel lead to the same continuation, the continuation wires c8 and c10 will reach
the same input pin in any global circuit in which our fragment is placed.

When the right branch exits T, the left branch must be killed; technically, its halt
statements must be removed from the current haltset. This is the role of the inhibition
wire i1 that sends an inhibition signal to the Halt register. In an actual execution
context, the inhibition signal can also come from an enclosing parallel statement itself
killed by some trap exit. It is then received on pin i by the wire i0.

The final equations of the Parallel and Halt cells are:

   Parallel  a'  = a
             c'  = c
             c'2 = c2
             p1  = c'2
             c'1 = c1 * ¬p1
             p0  = c1 + p1
             c'0 = c0 * ¬p0
             i'  = i + p1

   Halt      c'1 = c + (a * s)
             s' := (c + (a * s)) * ¬i


where p0 and p1 are local wires used to compute the parallel continuation and
inhibition values: if ci is the selected continuation, c'i is set and all continuations c'j are
reset for j ≠ i, and i' is set if p1 is.
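The synchronization equations can be checked by transcribing them directly. In the following sketch the function name parallel_sync and the variable names out0, out1, out2 (standing for c'0, c'1, c'2) are assumptions made for illustration; the equations themselves are those of the Parallel cell with one enclosing trap.

```python
def parallel_sync(c0, c1, c2, i):
    """Synchronization part of a Parallel cell with one enclosing trap.

    c0, c1, c2: wires signaling termination, halt, and exit T (each an or of
    the corresponding branch wires); i: incoming inhibition wire.
    Returns (c'0, c'1, c'2, i')."""
    p1 = c2                # p1 = c'2 = c2
    out2 = c2
    out1 = c1 & (1 - p1)   # c'1 = c1 * not p1
    p0 = c1 | p1           # p0 = c1 + p1
    out0 = c0 & (1 - p0)   # c'0 = c0 * not p0
    i_out = i | p1         # i' = i + p1
    return out0, out1, out2, i_out


# One branch halts while the other exits T: the exit wins and inhibition is raised.
exit_wins = parallel_sync(0, 1, 1, 0)
# One branch halts while the other terminates: the whole parallel halts.
halt_wins = parallel_sync(1, 1, 0, 0)
```

Whatever the inputs, at most one continuation is set, and it is the one with the highest termination code received, which is the max computation mentioned above.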


5.2a A sample execution: Assume the circuit receives control by c0 and therefore
sets c1.

•  Assume S2 is present. Then c5 is set by the Halt cell and c6 is set by the right
Present cell. The Parallel cell selects the appropriate continuation c10 and inhibits
the Halt register by setting i1.

•  Assume instead S2 is absent. Then c5 is set by the Halt cell and c7 is set by the right
Present cell. The selected continuation is c9; it signals halting to a possible
enclosing parallel statement. Since the inhibition wire i1 is low, the Halt cell register
is set. The circuit then remains in the same state in further clock cycles as long as the
activation wire a0 remains high and S1 remains low: the wires set are s2, s1, s0, a1,
c2, c4, a2, c5, and c9. If a0 remains high and S1 is set, the wires set are s2, s1, s0,
a1, c2, c3, and c8. The whole construct terminates and the register is reset since c1
and a2 are low. The incoming activation wire a0 can also become low before S1
occurs, for example because an enclosing watchdog elapses. Then the Halt register
is also reset.
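The cases of this sample execution can be replayed on a state-level model of the fragment. The sketch below is an abstraction under stated assumptions: it does not follow the wire numbering of figure 2, the function name fragment_step and its local names are hypothetical, and the branches are modeled directly from the Watch, Present, Halt, and Parallel equations.

```python
def fragment_step(h, start, a0, S1, S2, i0=0):
    """One tick of: trap T in await S1 || present S2 then exit T end end.

    h: current Halt register; start: incoming control; a0: incoming activation;
    i0: inhibition coming from an enclosing parallel.
    Returns (next_h, (termination, halting, exit_T))."""
    # Left branch: await S1, that is "do halt watching S1".
    left_term = a0 & h & S1           # the watchdog elapses (code 0)
    a_halt = a0 & h & (1 - S1)        # otherwise the halt stays activated
    halt_req = start | (a_halt & h)   # Halt cell: c'1 = c + (a * s) (code 1)
    # Right branch: only executed at the instant where control arrives.
    right_exit = start & S2           # exit T (code 2)
    right_term = start & (1 - S2)     # implicit "else nothing" (code 0)
    # Parallel synchronizer with one trap level.
    c0, c1, c2 = left_term | right_term, halt_req, right_exit
    p1 = c2
    p0 = c1 | p1
    outs = (c0 & (1 - p0), c1 & (1 - p1), c2)
    i1 = i0 | p1
    next_h = halt_req & (1 - i1)      # s' := (c + (a * s)) * not i
    return next_h, outs


started_with_s2 = fragment_step(0, 1, 1, 0, 1)
started_without_s2 = fragment_step(0, 1, 1, 0, 0)
waiting = fragment_step(1, 0, 1, 0, 0)
s1_occurs = fragment_step(1, 0, 1, 1, 0)
deactivated = fragment_step(1, 0, 0, 0, 0)
```

The five calls correspond to the cases discussed above: immediate exit with the Halt register inhibited, halting, remaining halted, termination on S1, and loss of activation.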
5.2b General parallel cells: In fact, the size of the priority queue in a parallel cell
depends on the number of nested traps exited from within its source parallel statement.
The number of pins ci, c'i for i ≥ 2 corresponds to the number of enclosing traps. With
no trap, there is no such pin. The example explained one level of trap. With two levels of
traps, as in

trap U in
   trap T in
      ... || ...
   end
end

there  would  be  a  pin  c2  for  T  and  a  pin  c3  for  U,  and  so  on. 
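A possible generalization to any number of trap levels is sketched below. It is an assumption, not the cell definition of the paper: the function name general_sync is hypothetical, and the priority chain is written as a simple loop from the highest code down, which coincides with the one-trap equations when three codes are given.

```python
def general_sync(codes, i):
    """Priority selection over termination codes.

    codes[k] is 1 if some branch signaled code k (0: termination, 1: halt,
    k >= 2: exit of an enclosing trap). Returns (continuations, i')."""
    out = [0] * len(codes)
    higher = 0                             # or of all codes above the current one
    for k in reversed(range(len(codes))):
        out[k] = codes[k] & (1 - higher)   # selected only if no higher code is set
        higher |= codes[k]
    i_out = i | int(any(codes[2:]))        # any trap exit kills the halts
    return out, i_out


no_trap = general_sync([1, 1], 0)
two_traps = general_sync([1, 1, 0, 1], 0)
```

With no trap the vector has only the termination and halt entries and inhibition is simply passed through; with two traps, an exit of the outer trap U takes priority over everything else.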

5.3  Summary  of  the  translation 

The translation is done by connecting together cells corresponding to source
statements. The cells are the same for all programs, but the parallel cells have variable
continuation arity according to the number of enclosing traps.

The logical skeleton of the translation is given by the tree of Halt, Watch, and Parallel
cells which mimics the tree of source halt, watching, and || statements. Each edge of the
tree is composed of an upward selection wire and a downward activation wire. Two sets
of wires reinforce the skeleton: control wires that signal halting and go upwards from
Halt and Parallel cells to Parallel cells, and opposite inhibition wires that force
resetting the Halt registers in case of exceptions.

In addition to the above cells, one finds a Boot cell used to boot the circuit, and
Present cells generated by source present and watching statements. These cells are
linked together and to skeleton cells by control wires. Each Present cell also receives a
signal wire as input. Signal wires come either from input signal pins or from local signal
cells, which are simply or gates. Control wires transfer the control from cell to cell. They
also emit signals by being connected to output signal pins or to local signal or gates.
The wiring of control wires is determined by a continuation analysis, see §6.

5.4  Optimization 

The  reader  may  find  that  our  circuits  contain  lots  of  wires  and  of  logical  levels,  even  for 
simple  programs.  In  fact,  this  is  because  they  are  obtained  by  a  structural  translation 
process  and  there  is  much  room  for  automatic  optimization.  Many  wires  are  simply 
connected  with  each  other.  Many  generated  logical  functions  are  readily  grouped  by 
logic  optimizers.  Constant  folding  can  also  be  used:  for  instance,  the  top  activation  wire 
is  always  set;  using  this  fact,  one  can  statically  simplify  many  gates. 

Therefore,  our  circuits  should  not  be  directly  implemented;  they  should  instead  be 
given  as  input  to  logic  optimizers.  We  presently  use  optimizers  based  on  Binary 
Decision  Dags  (or  BDD),  see  Brayton  et  al  (1990),  Coudert  &  Madre  (1990)  and  Savoj 
et  al  (1991).  They  drastically  reduce  the  actual  size  of  circuits.  They  can  also  discover 
redundancies  between  registers  and  suppress  some  of  them  (Berthet  et  al  1990). 

Altogether,  we  believe  that  we  can  obtain  final  circuits  that  are  as  good  as  carefully 
hand-designed  ones.  Because  of  the  power  and  efficiency  of  BDD-based  optimization 
techniques,  we  think  there  is  no  need  to  search  for  a  more  sophisticated  translation 
process. 
5.5  The translation is sometimes incorrect

Our  translation  does  not  translate  correctly  all  programs.  There  are  difficulties  with 
local  signals  and  with  loops  over  parallel  statements. 

First,  we  have  allocated  a  single  wire  for  a  local  signal.  But  even  within  a  single 
reaction,  an  ESTEREL  signal  can  have  several  independent  avatars.  Consider  a 
statement  of  the  form 

loop
   signal S in stat end
end

When  the  body  terminates,  it  is  restarted  at  the  same  instant  with  a  fresh  signal  S. 
This  is  made  obvious  by  unfolding  the  body  to  get 

loop
   signal S in stat end;
   signal S in stat end
end

which  is  semantically  equal  and  where  there  are  clearly  two  distinct  signals. 

In our circuits, a signal wire has only one state at a time: we cannot implement
general local signals. We must require all local signals to be declared at top level in
the module body. This is not too big a restriction in practice.

The  second  incorrectness  is  more  subtle.  The  translation  of  the  statement 

loop 
await  S 
end 

is  correct,  but  the  translation  of  the  equivalent  statement 

loop
   await S
||
   nothing
end

is not, since it involves an unstable combinatorial loop through the parallel
synchronizer: when S occurs, the parallel terminates and the loop makes it halt at the
same time on await S. But halting just inhibits the termination that should provoke it,
hence the combinatorial loop. Unfolding the body would solve the problem; it still
builds a combinatorial loop, but this time a safe one.

The  ESTEREL  software  checks  for  sufficient  conditions  for  translation  correctness. 
We  are  presently  investigating  a  more  powerful  translation  that  will  correctly  translate 
all  ESTEREL  programs.  It  will  be  reported  in  another  paper. 


6.  The  formal  translation  to  hardware 

We  define  the  translation  formally  and  prove  its  correctness  in  absence  of  bad  loops 
over  parallels.  As  explained  in  §5,  we  assume  all  local  signal  declarations  to  be  at 
top  level  in  the  module  body. 
6.1  Circuits

We  consider  a  circuit  to  be  given  by  a  set  of  input  wires,  a  set  of  output  wires,  a  set 
of  local  wires  and  a  set  of  wire  definitions  that  define  output  and  local  wires.  There 
are  two  kinds  of  wire  definitions: 

•  An implication definition w <= exp expresses a partial definition, read as “connect
exp to w”. There can be several implications per wire.

•  A register definition w := exp defines a wire to be initially 0 and then the value of
exp at the previous clock cycle. There can be only one register definition per wire.

Given a circuit C and a wire w, the set of implications w <= exp_i in C defines w as
w = ∨_i exp_i. Hence the right-hand sides of implications are connected to an or gate.
If a wire w has no definition, it is considered to have an empty set of implication
definitions, and is therefore defined by w = 0. To stress the fact that a wire has a
single implication definition in a circuit, we can write this definition using ‘=’ instead
of ‘<=’.

Given any register state and any input, the semantics of a circuit is classically
defined as a unique fixpoint of the equations, and a circuit is correct if a unique
fixpoint always exists in any (reachable) state. We assume this to be well-known.
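The fixpoint reading of wire definitions can be illustrated by brute force on very small circuits. The sketch below is only an assumption-based illustration, not the way real circuits or optimizers proceed: the function name fixpoints and the representation of implications as Python functions are hypothetical, and the enumeration is exponential in the number of wires.

```python
from itertools import product


def fixpoints(local_wires, implications, env):
    """Enumerate the assignments of local wires that satisfy all definitions.

    implications: dict mapping a wire to a list of functions values -> 0/1,
    one per implication 'w <= exp'; a wire is the or of its implications,
    and a wire without implication is 0.
    env: values of the input wires and of the registers for this clock cycle."""
    found = []
    for bits in product((0, 1), repeat=len(local_wires)):
        values = dict(env)
        values.update(zip(local_wires, bits))
        if all(values[w] == int(any(f(values) for f in implications.get(w, [])))
               for w in local_wires):
            found.append({w: values[w] for w in local_wires})
    return found


# x <= a, x <= b, y <= x * not c: an acyclic circuit has exactly one fixpoint.
acyclic = fixpoints(
    ["x", "y"],
    {"x": [lambda v: v["a"], lambda v: v["b"]],
     "y": [lambda v: v["x"] & (1 - v["c"])]},
    {"a": 0, "b": 1, "c": 0},
)
# w <= not w: an unstable combinatorial loop has no fixpoint.
unstable = fixpoints(["w"], {"w": [lambda v: 1 - v["w"]]}, {})
# w <= w: a loop with two fixpoints, hence not correct either.
ambiguous = fixpoints(["w"], {"w": [lambda v: v["w"]]}, {})
```

A circuit is correct in the sense used here when the returned list has exactly one element for every reachable register state and every input.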

6.2  The  translation  environment 

The formal translation is done by natural semantics inference rules (Kahn 1988). The
sequents have the form p ⊢ stat → C, where p is a wire environment, stat is an ESTEREL
statement, and C is the resulting circuit.

As  in  natural  semantics  or  in  PROLOG,  allocation  of  new  wires  is  implicit  and  done 
when  encountering  free  variables.  To  make  things  clear,  we  shall  comment  on  each 
rule  and  explicitly  tell  which  are  the  newly  allocated  wires. 

The  environment  p  is  made  of  several  wires,  whose  functions  have  been  explained 
in  §  5.  It  contains  the  following  fields 

•  An  incoming  control  wire  c. 

•  A  selection  wire  s. 

•  An  activation  wire  a. 

•  An  inhibition  wire  i. 

•  A vector of continuation wires c. The wire c^0 corresponds to termination, the wire
c^1 corresponds to halting, the wire c^(k+2) corresponds to exit k + 2, that is to exiting
k trap levels.

•  A  set  of  signal  wires  S,  one  for  each  input,  output,  or  local  signal  S.  For  simplicity, 
we  assume  that  all  local  signals  have  distinct  names;  then  all  local  signal  wires 
can  be  preallocated. 

We  use  the  classical  dot  notation  to  get  environment  components:  for  instance,  p.c 
denotes  the  control  wire  of  p.  Given  an  environment  p,  we  shall  often  need  to  consider 
another  environment  p'  that  differs  from  p  by  the  value  of  one  field,  say  by  changing 
p.c  into  c' .  We  then  write  p'  =  p  [c  <-  c'].  The  notation  extends  naturally  when  changing 
several  fields. 
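The environment and its field-update notation map naturally onto an immutable record. The sketch below is only illustrative: the class name Env, the field name conts (used for the continuation vector, since the name c is already taken by the control wire), and the wire names are assumptions.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class Env:
    c: str                    # incoming control wire
    s: str                    # selection wire
    a: str                    # activation wire
    i: str                    # inhibition wire
    conts: Tuple[str, ...]    # continuation wires: termination, halting, exits
    signals: Dict[str, str]   # one wire per signal


env = Env(c="b", s="s", a="1", i="i", conts=("c0", "c1"),
          signals={"I": "I", "R": "R", "O": "O"})

# p' = p[c <- c']: a copy of the environment that differs by one field.
env_c = replace(env, c="c'")
# The notation extends to several fields at once.
env_sa = replace(env, s="s'", a="a'")
```

Because the record is frozen, an update never modifies the environment shared with sibling statements, which matches the functional reading of p[c <- c'].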

To translate a module, we allocate a boot control wire b and a register n of equations
b = ¬n, n := 1 as in §5, a dummy selection wire s, two dummy wires c0 and c1 for the
(unused) continuations, a dummy inhibition wire i, and one wire S per signal,
declaring respectively input and output signals as inputs and outputs to the circuit.
We translate the module body in the environment

   p0 = (b, s, 1, i, (c0, c1), S).


6.3  The  translation  rules 

The  cells  of  §5  were  useful  for  an  intuitive  explanation,  but  in  rules  it  is  simpler  to 
produce  equations  directly. 

For  a  nothing  statement,  we  connect  the  incoming  control  to  the  termination 
continuation  wire. 

   p ⊢ nothing → p.c^0 <= p.c

For a halt statement, we connect the incoming control to the halt continuation wire, to
signal halting to an enclosing parallel statement. We allocate a new selection wire s'
defined as a register with input as explained in §5. We connect it to the environment
selection wire p.s.

   p ⊢ halt →   p.c^1 <= p.c + (p.a * p.s)
                p.s <= s'
                s' := (p.c + (p.a * p.s)) * ¬p.i


For an emit statement, we connect the incoming control to the termination wire
and to the signal wire.

   p ⊢ emit S →   p.c^0 <= p.c
                  p.S <= p.c


For a sequence, we allocate a new wire c' for control transmission. We translate the
first statement with c' as termination and the second statement with c' as incoming
control.

   p[c^0 <- c'] ⊢ stat1 → C1
   p[c <- c'] ⊢ stat2 → C2
   ----------------------------
   p ⊢ stat1; stat2 →   C1
                        C2


For a loop, we allocate a new wire c' to handle looping and we connect the incoming
control to it. We translate the body with c' both as incoming control and as outgoing
continuation.

   p[c <- c', c^0 <- c'] ⊢ stat → C
   ---------------------------------
   p ⊢ loop stat end →   c' <= p.c
                         C


For a present statement, we allocate two new control wires c1 and c2; then c1 is set
when the incoming control is present and the signal is present, while c2 is set when the
incoming control is present and the signal is absent. We translate the branches with c1
and c2 as respective incoming controls.

   p[c <- c1] ⊢ stat1 → C1
   p[c <- c2] ⊢ stat2 → C2
   ------------------------------------------------
   p ⊢ present S then stat1 else stat2 end →   c1 = p.c * p.S
                                               c2 = p.c * ¬p.S
                                               C1
                                               C2


For a watching statement, we allocate a new selection wire s' and connect it to p.s, and
we allocate a new activation wire a'. The outgoing activation wire a' is set if s' and p.a
are set and the signal is absent. The outgoing termination wire p.c^0 is set if s' and p.a are
set and the signal is present.

   p[s <- s', a <- a'] ⊢ stat → C
   ----------------------------------
   p ⊢ do stat watching S →   p.s <= s'
                              a' = p.a * p.s * ¬p.S
                              p.c^0 <= p.a * p.s * p.S
                              C


The parallel rule is of course the most complex one. It follows exactly the intuitive
explanation given in §5. We allocate a selection wire s' connected to p.s, an inhibition
wire i', a continuation vector c' of the same length k as p.c, and a priority vector p of
length k − 1. We recursively translate the body with the new selection, inhibition, and
continuation wires. Then we establish the priority queue to compute the outgoing
continuations and the new inhibition wire i'.

   k = |p.c|
   p[s <- s', i <- i', c <- c'] ⊢ stat1 → C1
   p[s <- s', i <- i', c <- c'] ⊢ stat2 → C2
   -------------------------------------------
   p ⊢ stat1 || stat2 →   p.s <= s'
                          i' = p.i          if k < 3
                          i' = p.i + p^1    if k ≥ 3
                          C1
                          C2


For a trap, we shift by 1 all wires in p.c after position 2 and we insert the termination
continuation p.c^0 at exit position 2. The vector notations are obvious.

   p[c <- (p.c^0, p.c^1, p.c^0) · p.c^(2..)] ⊢ stat → C
   -----------------------------------------------------
   p ⊢ trap T in stat end → C

For an exit, we connect the incoming control to the appropriate continuation.

   p ⊢ exit T^k → p.c^k <= p.c

For  a  local  signal  declaration,  we  simply  translate  the  body  since  the  signals  have  been 
pre-allocated. 


   p ⊢ stat → C
   ------------------------------
   p ⊢ signal S in stat end → C
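To make the rule style concrete, here is a minimal sketch of a translator for the purely combinatorial subset of the language (nothing, emit, exit, sequence, present, and trap). It is an illustration under stated assumptions, not the ESTEREL compiler: statements are encoded as hypothetical Python tuples, the environment is reduced to the incoming control wire and the continuation vector, fresh wires are named w0, w1, ..., and the generated implications are evaluated by a simple recursive walk that is only valid for acyclic circuits.

```python
from itertools import count


def translate(stat, c, conts, defs, fresh):
    """Append the implications (wire, expr) generated for stat to defs.

    c: incoming control wire; conts: continuation vector (tuple of wires)."""
    kind = stat[0]
    if kind == "nothing":
        defs.append((conts[0], ("wire", c)))
    elif kind == "emit":                     # ("emit", S)
        defs.append((conts[0], ("wire", c)))
        defs.append((stat[1], ("wire", c)))
    elif kind == "exit":                     # ("exit", k), k = 2 for the innermost trap
        defs.append((conts[stat[1]], ("wire", c)))
    elif kind == "seq":                      # ("seq", stat1, stat2)
        mid = next(fresh)                    # the new wire c' of the sequence rule
        translate(stat[1], c, (mid,) + conts[1:], defs, fresh)
        translate(stat[2], mid, conts, defs, fresh)
    elif kind == "present":                  # ("present", S, stat1, stat2)
        c1, c2 = next(fresh), next(fresh)
        defs.append((c1, ("and", c, stat[1])))
        defs.append((c2, ("andnot", c, stat[1])))
        translate(stat[2], c1, conts, defs, fresh)
        translate(stat[3], c2, conts, defs, fresh)
    elif kind == "trap":                     # ("trap", stat)
        translate(stat[1], c, conts[:2] + (conts[0],) + conts[2:], defs, fresh)
    else:
        raise ValueError(kind)


def value(w, defs, inputs, memo):
    """Value of wire w: the or of its implications (0 if it has none)."""
    if w in inputs:
        return inputs[w]
    if w not in memo:
        memo[w] = int(any(holds(e, defs, inputs, memo) for x, e in defs if x == w))
    return memo[w]


def holds(e, defs, inputs, memo):
    if e[0] == "wire":
        return value(e[1], defs, inputs, memo)
    left = value(e[1], defs, inputs, memo)
    right = value(e[2], defs, inputs, memo)
    return left & right if e[0] == "and" else left & (1 - right)


def react(program, S):
    """Translate program, send it the control, and return the wires (A, B, k0)."""
    defs = []
    translate(program, "go", ("k0", "k1"), defs, (f"w{n}" for n in count()))
    inputs, memo = {"go": 1, "S": S}, {}
    return tuple(value(w, defs, inputs, memo) for w in ("A", "B", "k0"))


# trap T in (present S then exit T else emit A end; emit B) end
program = ("trap", ("seq", ("present", "S", ("exit", 2), ("emit", "A")), ("emit", "B")))
with_s, without_s = react(program, 1), react(program, 0)
```

When S is present, the exit bypasses both emissions and reaches the termination continuation k0 directly; when S is absent, A and B are emitted before k0 is set. Statements involving registers (halt, watching, parallel) and loops are deliberately left out of this sketch.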


7.  Correctness  of  the  translation 

We first explain roughly the proof idea as if the translation were always correct.
Consider the body stat of a correct module placed in the initial environment where the
local signal wires have been cut. Then there are two separate wires for each local signal,
one for input and one for output. Consider a signal environment E and a haltset H.
There exists a unique behavior of stat in E, leading to stat' with stat' = R(stat^H'), and a
unique behavior of R(stat^H) in E, leading to stat'' with stat'' = R(stat^H''); uniqueness is
obvious since there are no local signal declarations in stat.

The circuit fragment C(stat) obtained by translating stat has two incoming control
wires c and a. Then setting c realizes the first behavior, while setting the activation wire
a realizes the second behavior. Furthermore, because of loops, c and a can be both set.
Then the circuit sums up both behaviors with no interference between them. The proof
goes simply by structural induction.

Once this is shown, close the local signal wires. Then, for the module body stat, for
any state H and real input event I, there exists a unique local event L and a unique
output event O such that

                       O ∪ L
   R(stat^H)  ------------------------>  R(stat^H')
                I ∪ L ∪ O ∪ {tick}

But closing the local signal wires in the circuit has exactly the same coherence effect as
in the semantics: a signal is there if and only if it is emitted. Since the circuit can do
nothing but mimic the behavioral semantics and since there is only one fixpoint in the
semantics by the correctness hypothesis, there is only one fixpoint in the circuit and it is
the required one⁴.

Therefore,  one  can  view  the  circuit  as  a  folding  of  all  possible  behavioral  semantics 
proof  trees  of  a  program  and  of  its  residuals  in  all  possible  environments.  What  the 
electrons  do  is  to  select  the  right  proof  tree  in  one  clock  cycle  given  a  residual  and  an 
input. 


4  We  talk  here  of  abstract  circuits,  or  equivalently  we  assume  that  concrete  circuits  always  find 
the  unique  fixpoint  when  it  exists. 
The only problem with the above proof argument is that sending control to a parallel
by both c and a does not sum up the behaviors: one of the continuations can be
discarded by the other one. Here, we shall simply prove that the circuit works fine under
the assumption that the problem can never appear dynamically⁵. This leads to the
following condition:

Condition NSP. A correct program is said to be NSP (Non Schizophrenic for Parallels)
if for any haltset H and for any event E, no parallel subterm stat = stat1 || stat2 that
contains a halt in H is evaluated in the behavioral semantics proof of the reaction of the
module body under E both under the form stat and under the form R(stat^H+).

This  is  certainly  a  strange  and  non-structural  condition,  but  its  main  advantage  is  to 
be  amazingly  trivial  to  check  in  the  ESTEREL  software  compiling  process.  We  have  put 
an  appropriate  specific  option  in  the  ESTEREL  compiler  to  report  its  failure. 

Theorem.  For  any  correct  NSP  ESTEREL  module  M ,  the  circuit  C(M)  has  exactly  the 
same  input-output  behavior  as  M. 

Proof.  The  proof  goes  just  as  sketched,  but  we  must  inductively  ensure  that  no 
parallel  receives  c  and  a  together. 

We first study the circuit reactions when the local signal wires are opened. We
consider a given haltset H and a given input event E. Let P be the proof of the
behavior of R(stat^H) in E, leading to R(stat^H').

Given a subterm stat1 of stat, define the type of stat1 in P as follows: stat1 is of type
null if it does not appear in P, of type c if it appears in P only under the form stat1, of
type a if it appears in P only in the form R(stat1^H+), and of type ca if it appears in both
forms.

For the circuit C(stat1) generated by stat1, we say that we send the control null if we
set neither c nor a, the control c if we set c and not a, the control a if we set a and not c
while s is set, and the control ca if we set both c and a while s is set.

We show the following properties on any subterm stat1, by structural induction:

(a)  If stat1 receives the control as indicated by its type in P, then it will itself send the
control to all its subterms as indicated by their type in P.

(b)  Under control null, C(stat1) sets no continuation, no signal, and no halt.

(c)  If stat1 is of type c and stat1 has in P a behavior leading to stat1', emitting E'c
with termination code kc, then, under control c, C(stat1) emits E'c, sets the sole
continuation c^kc, and sets exactly the halts in Hc iff its incoming inhibition wire i
has value 0.

(d)  If stat1 is of type a and R(stat1^H+) has in P a behavior leading to stat1'', emitting
E'a with termination code ka, then, under control a, C(stat1) emits E'a, sets the sole
continuation c^ka, and sets exactly the halts in Ha iff its incoming inhibition wire i
has value 0.

(e)  If stat1 is of type ca, then, under control ca, C(stat1) realizes the union of the
behaviors of cases (c) and (d).


5  The  right  solution  would  be  to  use  two  synchronizers,  one  for  c  and  one  for  a,  and  to  duplicate 
some  of  the  logic  of  the  body  to  signal  termination  to  the  appropriate  synchronizer;  in  fact,  one 
must  use  more  than  two  synchronizers  in  the  general  case  to  properly  handle  parallel  statement 
nesting;  this  will  be  the  subject  of  a  forthcoming  paper. 
First notice some general facts. The s wire is set for stat1 iff stat1^H+. Hence only
statements that contain halts in H will receive both a and s. By construction, any circuit
C(stat1) does nothing under control null and sets no halt when its incoming inhibition
wire i is set; otherwise, it sets its halts normally. Also, since all statements merge their
emitted signals by or gates, the signal behavior will always be the expected one.

The statements nothing, emit S, and exit T are always of type null or c and they
exhibit the (c) behavior under c. A halt can be of any type, but it always sets c^1 and its
register if i = 0 as required under control c, a, or ca.

Consider a sequence stat1; stat2 of type c. Then stat1 is itself of type c, and the
induction tells us that C(stat1) behaves just as stat1 under c. If stat1 terminates, then stat2
is of type c since the first sequence rule must be applied in the proof (it cannot be of type
ca, otherwise the sequence itself would be of that type). But C(stat1) sets c^0, which starts
stat2 under control c by the sequence wiring. The induction shows (c). If stat1 does not
terminate, then stat2 is of type null and C(stat2) receives no control and does nothing;
hence the sequence behaves just as stat1, which shows (c). Condition (a) also follows
from this case analysis.

The proof of (d) and (a) is similar for a sequence of type a, analysing separately
the cases stat1^+ and stat2^+.

Consider finally a sequence of type ca. First assume stat1^+. Then stat1 itself is of type
ca, and the induction applies to it. Furthermore, stat2 is started under c iff stat1
terminates under c, a, or both. But giving the control to stat2 twice is just the same as
giving it once, since incoming control wires are gathered by an or gate, and (e) follows.
Next assume stat2^+. Then stat1 is of type c, while stat2 is of type a if stat1 does not
terminate, making (e) obvious, and of type ca if stat1 terminates; in the latter case, (e) is
established by induction on stat2. The case analysis is finished for the sequence, and it
also shows (a) in all cases.

The other operators are handled in the same way. For a parallel, one is never in case
(d) by the NSP hypothesis, and one remembers that the i wire is set in case of exit to kill
the haltsets of the subterms.

Finally,  as  explained  before,  the  circuit  is  forced  to  compute  the  same  fixpoint  as  the 
behavioral  semantics  when  closing  the  local  signal  wires.  To  finish  the  proof,  just  notice 
that  the  module  body  stat  receives  c  at  the  first  instant  from  the  boot  wire  and  a  at  the 
next  instants  from  the  selection  wire  that  is  plugged  back  as  the  activation  wire. 


8.  Implementation 

8.1  Actual  implementation  on  PERLEO 

We have experimented with our hardware implementation on the PERLEO board developed
at DEC PRL (Bertin et al 1989). It consists of a set of 25 synchronous XILINX
programmable logic cell arrays placed on a board and piloted by a SUN™ workstation.

The translation is performed by the strldg processor (ESTEREL to digital), which is
integrated in the standard ESTEREL compiler (see footnote 6). The generated logical circuit
is printed out in PERLEO format and translated into XILINX native format by the PERLEO
software (we could as well produce portable formats such as PALASM). The logical circuit is
then given to optimizers and the optimized result is fed into an automatic placer-router,
without any pre-placing indication. This gives a XILINX circuit specification. Using this
environment, the turnaround time is of the order of 15 minutes from source program to
running circuit for a medium-size program.

6 In fact, most of the skeleton and continuation analysis is already done by the standard
compiler first pass.

On PERLEO, we provide a symbolic debugging and exact speed-measurement
environment, with interactive symbolic input and output from within Lisp or C. The
speed measure reports the maximal clock speed at which a circuit correctly handles a
benchmark. In practice, the minimal clock period is 30 to 75 nanoseconds for a small
program (30 ns for the circuit presented in the appendix), and 75 to 100 nanoseconds for
a medium-size program that still fits into a single chip (about 2-4 pages of source
ESTEREL code), on a XILINX 3020 chip.
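For readers who think in clock frequencies rather than periods, the figures quoted above convert by f = 1/T. The following Python sketch is only an illustrative conversion; the function name is ours and is not part of the PERLEO tools.

```python
def period_ns_to_mhz(period_ns: float) -> float:
    """Convert a clock period in nanoseconds to a frequency in MHz (f = 1/T)."""
    return 1000.0 / period_ns


# Periods reported in the text for small and medium-size programs.
for period in (30, 75, 100):
    print(f"{period} ns -> {period_ns_to_mhz(period):.1f} MHz")
```

A 30 ns period thus corresponds to roughly 33 MHz and a 100 ns period to 10 MHz.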

In  debug  or  speed-measure  mode,  the  ESTEREL  program  is  implemented  on  a  single 
chip  and  other  chips  are  devoted  to  bus  and  debug  interfaces.  The  applications  we  have 
handled  so  far  are  man-machine  interfaces,  real-size  local  area  network  controllers 
(Mejia Olvera 1989), and various circuit controllers including those used in the PERLEO
board  itself  to  communicate  with  the  bus  and  with  the  tested  program. 

8.2  Simulation  and  correctness  proofs 

ESTEREL and LUSTRE are themselves able to describe digital hardware. The strldg
processor is also able to unparse the circuit in ESTEREL or LUSTRE. There are two main
uses:

•  After compiling the ESTEREL version of the circuit, we can use the full ESTEREL
programming environment to perform simulations, analysis, and optimizations.

•  Once the circuit behavior automaton is generated by the ESTEREL compiler, we can use
the AUTO verification system to automatically check for equivalence between the
source code and the circuit automata (Boudol et al 1990). This may seem
unnecessary since the translation has been mathematically proved correct, but
software is software and double-checks are always useful. Furthermore, the
translation can work properly even if the sufficient correctness conditions are not
met. If AUTO reports equivalence, the circuit is perfectly usable even if it works by
chance!

Of  course,  using  the  ESTEREL  standard  compiler  for  such  a  circuit  unparsing  analysis 
makes  sense  only  if  the  circuit  has  a  reasonable  number  of  states,  say  50  to  500,  which  is 
usually  the  case  for  controllers. 


9.  Conclusion 

Although  ESTEREL  was  not  at  all  designed  as  a  hardware  description  language,  the 
work  presented  here  shows  it  is  well-suited  to  very  high-level  verified  hardware 
generation.  The  hardware  implementation  is  directly  based  on  the  formal  semantics. 
The  electrons  circulating  in  the  wires  perform  the  computation  of  the  proof  tree 
associated  with  a  program  and  an  input  within  a  single  clock  cycle.  The  circuit  itself  can 
be  viewed  as  a  folding  of  all  possible  semantical  proof  trees  into  a  graph  structure. 
The translation we have presented is not general since programs are assumed to obey
a sufficient NSP condition; we are now in the process of releasing a fully correct
translation of ESTEREL into circuits, based on extensions of the same ideas.

We  investigate  three  main  kinds  of  applications:  implementing  existing  ESTEREL 
programs  on  hardware  to  improve  their  performance,  using  ESTEREL  to  directly 
program  hardware  controllers,  and  using  ESTEREL  to  build  reference  controllers  to 
which actual hand-tailored controllers can be automatically proved equivalent. Our
present experiments are very promising and leave room for sophisticated optimization.

To our knowledge, the closest related works are the hardware implementations of
LUSTRE and SML (Clarke et al 1991). The LUSTRE and ESTEREL implementations are
developed  in  parallel  and  are  fully  compatible.  Compared  to  SML,  ESTEREL  is  much 
more  elaborate  as  a  programming  language,  having  in  particular  watchdogs, 
exceptions,  and  instantaneous  broadcast.  Our  implementation  is  direct  and  does  not 
use  a  translation  to  automata,  although  such  a  translation  is  also  available.  LUSTRE, 
SML,  and  ESTEREL  all  give  access  to  temporal  logic  or  process  calculi  based  verifiers.  We 
need  more  experience  to  compare  the  relative  qualities  of  the  languages  and  of  their 
verification  tools. 


This  work  was  motivated  by  discussions  with  Jean  Vuillemin  and  Patrice  Bertin  from 
DEC  Paris  Research  Laboratory.  It  owes  much  to  the  work  of  Georges  Gonthier  on  the 
semantics  of  ESTEREL.  The  actual  implementation  on  PERLEO  was  done  at  DEC  PRL 
under  the  supervision  of  Patrice  Bertin,  who  provided  invaluable  help.  The 
experiments with BDD optimizers were conducted by Olivier Coudert and Jean-Christophe
Madre (BULL CRG), as well as by Hervé Touati (DEC PRL).


Appendix  A. 

A1.  A simple bus interface example

As a toy application example, we program an interface module between a bus and a
hardware application. This interface is a slight simplification of the one actually used
in the PERLEO board to run hardware translations of actual ESTEREL programs. Although
the program is very small, we use submodules to illustrate modular programming.

A1.1  The interface informal specification

The  interface  repeatedly  waits  for  input  from  the  bus,  tells  the  application  to  store  the 
corresponding  data  word,  triggers  a  computation,  and  tells  the  application  to  send 
back  the  output  data  word  to  the  bus  when  the  computation  is  terminated  and  the  bus 
is  ready  for  output. 

The interface receives two signals from the bus, BUS_WRITE for input and
BUS_READ for output. It acknowledges both input and output by sending back
BUS_ACK.

Data words are received or emitted directly by the application. To control data
input, the interface tells the application to connect its input buffers to the data bus by
setting a signal OPEN_INPUT. This signal is maintained up to and including the instant
at which BUS_WRITE arrives. After one clock cycle, the interface sends BUS_ACK and starts
the computation by sending a signal GO to the application. When the computation is
terminated, the application sends back a signal FINISHED. The output data is then
ready in the application output buffers. The interface tells the application to connect its
output buffers to the bus by sending a signal OPEN_OUTPUT. This can be done only
when the computation is finished and when the bus has sent BUS_READ. After
waiting a clock cycle for the data to be effectively present on the bus, the interface sends
BUS_ACK.

In  addition,  we  assume  that  the  bus  can  send  at  any  time  a  RESET  signal  telling  the 
interface  to  reset  itself  to  its  initial  state. 
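To make the informal specification concrete, here is a minimal Python sketch of a hand-written state machine that follows it cycle by cycle. It is our own reading of the specification, not the output of the ESTEREL compiler: the class name, state names and the `step` method are hypothetical. We also assume that the interface starts requesting the next input word in the same cycle in which it acknowledges the output, and that RESET takes priority over every other signal. Corner cases such as a BUS_WRITE present in the very first cycle are simplified.

```python
class InterfaceModel:
    """Hand-written state machine following the informal specification."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.state = "WAIT_WRITE"
        self.got_read = False
        self.got_finished = False

    def step(self, inputs):
        """Advance one clock cycle; `inputs` and the result are sets of signal names."""
        out = set()
        if "RESET" in inputs:
            # RESET preempts whatever was going on and restarts the input phase.
            self.reset()
            out.add("OPEN_INPUT")
            return out
        if self.state == "WAIT_WRITE":
            out.add("OPEN_INPUT")  # maintained up to and including BUS_WRITE
            if "BUS_WRITE" in inputs:
                self.state = "ACK_INPUT"
        elif self.state == "ACK_INPUT":
            out |= {"BUS_ACK", "GO"}  # one cycle after BUS_WRITE
            self.got_read = self.got_finished = False
            self.state = "COMPUTE"
        elif self.state == "COMPUTE":
            # Wait for both events, in any order or in the same cycle.
            self.got_read |= "BUS_READ" in inputs
            self.got_finished |= "FINISHED" in inputs
            if self.got_read and self.got_finished:
                out.add("OPEN_OUTPUT")
                self.state = "ACK_OUTPUT"
        elif self.state == "ACK_OUTPUT":
            # Acknowledge the output, then wait for the next input word.
            out |= {"BUS_ACK", "OPEN_INPUT"}
            self.state = "WAIT_WRITE"
        return out


model = InterfaceModel()
for cycle_inputs in [set(), {"BUS_WRITE"}, set(), {"FINISHED"}, {"BUS_READ"}, set()]:
    print(sorted(cycle_inputs), "->", sorted(model.step(cycle_inputs)))
```

Writing such a machine by hand is feasible here, but every state and every RESET case has to be spelled out explicitly.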

A1.2  The interface ESTEREL program

The interface module is written as follows:

    module Interface:
    input BUS_READ, BUS_WRITE, RESET;      % from bus
    output BUS_ACK;                        % to bus
    output OPEN_INPUT, OPEN_OUTPUT, GO;    % to application
    input FINISHED;                        % from application
    loop
       loop
          run Input;
          run ComputeAndOutput
       end
    each RESET.

Notice that the RESET signal is completely factored out and effectively resets the
interface independently of its current internal state.

The Input submodule is written as follows:

    module Input:
    input BUS_WRITE;       % from bus
    output BUS_ACK;        % to bus
    output OPEN_INPUT;     % to application
    trap INPUT in
       sustain OPEN_INPUT
    ||
       await BUS_WRITE do exit INPUT end
    end;
    await tick;
    emit BUS_ACK.

Here we use a trap construct to ensure that OPEN_INPUT is emitted when
BUS_WRITE is received. One could write as well:

    do
       sustain OPEN_INPUT
    watching BUS_WRITE;
    emit OPEN_INPUT;

By the semantics of the watching construct, the statement “sustain OPEN_INPUT” is
not executed when BUS_WRITE occurs. This is why OPEN_INPUT must be explicitly
emitted at that instant.

The  ComputeAndOutput  module  is  written  as  follows: 

    module ComputeAndOutput:
    input BUS_READ;             % from bus
    output BUS_ACK;             % to bus
    output GO, OPEN_OUTPUT;     % to application
    input FINISHED;             % from application
    [
       await BUS_READ
    ||
       emit GO;
       await FINISHED;
    ];
    emit OPEN_OUTPUT;
    await tick;
    emit BUS_ACK.

Notice how the parallel statement realizes the synchronization: it terminates exactly
when the computation is finished and the bus is ready to read.

Once optimized, placed, and routed, the circuit uses up 9 cells on a XILINX 3020
circuit. There are 5 registers and 11 logical functions with a total of 35 inputs.

A1.3  The advantages of ESTEREL


The automaton generated by the ESTEREL compiler is pictured in figure A1. Notice the
diamond generated by the parallel statement that appears in ComputeAndOutput.
Notice also the reset arrows that go from any state into state 1: they are all generated by
the single “loop-each RESET” statement.

[Figure A1. The automaton generated by the ESTEREL compiler.]

Of course, such a small automaton can be easily designed by hand. The advantage of
ESTEREL programming really appears for more complex controllers. The modularity of the
language, its built-in concurrency, and the power of its control structures allow the user
to build controllers by assembling individually simple modules into bigger ones. For
example, to perform speed benchmarks on PERLEO, we use a variant of the bus interface
that inputs two data words and performs computation and output twice in a row. To obtain
this interface, one just changes the Interface module body into (roughly):

    run Input [signal OPEN_INPUT_1/OPEN_INPUT];
    run Input [signal OPEN_INPUT_2/OPEN_INPUT];
    run ComputeAndOutput [signal OPEN_OUTPUT_1/OPEN_OUTPUT,
                                 GO_1/GO];
    run ComputeAndOutput [signal OPEN_OUTPUT_2/OPEN_OUTPUT,
                                 GO_2/GO]

Usually, a relatively simple change to a specification involves a simple and local
change to an ESTEREL program. This is definitely not true of finite automata, which
are highly unstable with respect to specification changes. We strongly believe that
programming controllers in ESTEREL is one order of magnitude simpler than designing
finite state machines by hand.
Abstract.  A distributed computer system consists of different processes
or  agents  that  function  largely  autonomously  and  coordinate  their  actions 
by  communicating  with  each  other.  In  such  a  situation,  actions  may  be 
performed  by  different  agents  of  the  system  locally,  in  a  concurrent  manner. 

In  this  paper,  we  first  discuss  formal  models  of  distributed  systems  in 
which  concurrency  is  specified  explicitly,  in  contrast  to  more  traditional 
approaches where concurrency is represented implicitly as a nondeterministic
choice between all possible sequentializations of concurrent
actions.  This  naturally  leads  to  models  based  on  partially-ordered  sets  of 
actions  rather  than  sequences  of  actions  and  is  often  called  the  true 
concurrency  approach.  The  models  we  focus  on  are  distributed  transition 
systems,  elementary  net  systems  and  event  structures. 

In  the  second  half  of  the  paper,  we  develop  a  family  of  logics  to  specify 
and  reason  about  the  behavioural  properties  of  the  models  we  have 
described.  The  logics  we  define  are  extensions  of  temporal  logic  with  new 
modalities  to  directly  describe  concurrency. 

This  paper  is  essentially  a  survey  of  work  done  by  the  authors  during 
the  last  few  years  on  modelling  distributed  systems  with  true  concurrency 
and  using  logic  to  reason  about  these  models.  The  emphasis  is  on 
motivating  definitions  through  examples  and  on  presenting  major  results, 
without  going  into  too  many  formal  details.  We  provide  pointers  to  the 
literature  where  these  details  can  be  found. 

Keywords.  Concurrency;  temporal  logic;  distributed  systems;  logics  of 
programs. 


1.  Introduction 

The  study  of  distributed  systems  and  computations  is  an  important  topic  of  research 
in  computer  science.  A  distributed  system  consists  of  a  number  of  essentially 
autonomous  components  that  work  together  to  perform  a  complex  task. 

A  computer  network  which  brings  together  a  heterogeneous  collection  of  computing 
resources  and  users  dispersed  over  a  wide  geographic  area  is  a  classic  example  of  a 
distributed system. Distributed databases constitute yet another class of examples. At
a  lower  level,  computer  protocols  which  facilitate  efficient  and  reliable  transmission 
of  electronic  data  and  operating  systems  which  coordinate  the  activities  of  multiple 
processes  (programs)  in  the  presence  of  multiple  processors  can  also  be  viewed  as 
distributed  systems.  With  the  advent  of  VLSI  systems,  the  notion  of  a  distributed 
system  is  also  becoming  relevant  at  the  circuit  level. 

The  theory  of  distributed  systems  consists  of  formulating  abstract  mathematical 
models  of  distributed  systems  and  studying  the  properties  of  these  models.  A  basic 
motivation in the study of formal models is to develop tools and techniques with
which one can specify, analyse and implement distributed systems. Another goal is
to  develop  formal  means  for  reasoning  about  the  behaviour  of  distributed  systems.  This 
is  important  because  one  would  like  to  ensure  that  a  specification  is  in  some  sense 
consistent  before  one  attempts  an  analysis  or  an  implementation.  Even  more 
importantly,  one  would  like  to  guarantee  that  a  proposed  implementation  indeed 
meets  the  requirements  of  a  specification. 

In  this  paper,  we  present  some  of  our  work  in  the  last  few  years  on  modelling 
distributed  systems  with  true  concurrency,  using  logic  to  reason  about  these  models. 
The  emphasis  is  on  motivating  definitions  through  examples  and  on  presenting  major 
results.  No  attempt  will  be  made  to  go  into  formal  details;  we  shall  provide  pointers 
to  the  literature  where  these  details  can  be  found. 

In  the  first  part  of  the  paper,  we  introduce  three  models  called  distributed  transition 
systems,  elementary  net  systems  and  event  structures.  Using  these  models,  we  illustrate 
some  of  the  fundamental  features  of  distributed  systems,  such  as  causality,  choice  and 
concurrency. 

In  the  second  half  of  the  paper,  we  develop  a  family  of  logics  to  specify  and  reason 
about  the  behavioural  properties  of  the  models  considered  in  the  first  half  of  the  paper. 


2.  Models  for  true  concurrency 

Typically,  a  distributed  system  consists  of  spatially  separated  processes  or  agents 
performing  a  joint  task.  The  agents  function  largely  autonomously  and  coordinate 
their  actions  by  communicating  with  each  other.  In  such  a  situation,  actions  may  be 
performed  by  different  agents  of  the  system  locally,  in  a  concurrent  manner. 

Informally, we say that two events are concurrent if they occur with no a priori
ordering  over  their  occurrences.  This  is  in  contrast  to  a  sequential  system  in  which 
any  two  events  that  occur  in  a  computation  must  be  ordered. 

In  addition  to  concurrency,  two  other  aspects  are  of  interest  in  the  theory  of 
distributed  systems  -  causality  and  choice.  Causality  refers  to  the  fact  that  certain 
events  in  a  distributed  system  can  only  occur  in  a  fixed  order;  for  example,  a  message 
can  be  received  only  after  it  has  been  sent.  The  receipt  of  a  message  is  said  to  be 
causally  dependent  on  the  sending  of  the  message. 

Choice  captures  the  fact  that  systems  can  behave  in  an  indeterminate  fashion.  In 
other  words,  at  certain  points  of  the  computations,  the  system  may  choose  between 
alternative  events,  leading  to  different  behaviours. 

As  we  shall  see,  labelled  transition  systems  are  simple  and  convenient  models  of 
sequential  systems  which  can  explicitly  describe  causality  and  choice,  but  which  do 
not  have  a  natural  way  of  representing  concurrency.  One  way  of  describing 
concurrency  in  the  framework  of  transition  systems  is  in  terms  of  indeterminacy.  In 
this approach, the fact that a set of actions may be performed concurrently is
represented  by  permitting  the  system  to  choose  between  all  possible  sequentializations 
of  the  actions.  This  approximation  of  concurrency  by  interleaving  is  used  in  various 
algebraic  approaches  to  the  theory  of  distributed  systems  such  as  a  calculus  for 
communicating  systems  (CCS)  (Milner  1989),  communicating  sequential  processes  (CSP) 
(Hoare  1984)  and  algebra  of  communicating  processes  (ACP)  (Bergstra  &  Klop  1984). 

Such  an  implicit  representation  of  concurrency  leads  to  problems  in  analysing 
system  behaviour,  due  to  the  combinatorial  explosion  in  the  number  of  possible 
interleavings.  We  follow  an  alternative  approach,  called  “true  concurrency”,  where 
concurrency  is  represented  explicitly  in  the  models. 

Many  abstract  models  of  distributed  systems  have  been  suggested  which  explicitly 
deal  with  the  phenomena  of  causality,  choice  and  concurrency.  Here,  we  shall  consider 
three  of  these  models  -  distributed  transition  systems,  elementary  net  systems  and 
event  structures.  We  shall  also  discuss  a  model  called  communicating  sequential 
agents.  This  model,  based  on  a  restricted  class  of  event  structures,  captures  in  a  natural 
way  the  intuitive  picture  of  a  distributed  system  as  a  collection  of  sequential  agents 
coordinating  their  actions  through  communication. 

2.1  Distributed  transition  systems 

Before  discussing  models  of  concurrent  systems,  let  us  briefly  look  at  sequential 
systems.  Transition  systems  are  a  basic  model  of  sequential  systems. 

DEFINITION  1.1 

A ($\Sigma$-labelled) transition system is a triple $TS = (S, \Sigma, \to)$ where

(1)  $S$ is a set of states.

(2)  $\Sigma$ is a set of actions.

(3)  $\to \subseteq S \times \Sigma \times S$ is the transition relation.

If $(s, a, s') \in \to$, then the idea is that the action $a$ can occur at state $s$ and after the
execution of $a$ the system assumes the state $s'$. We shall often write $s \xrightarrow{a} s'$ instead
of $(s, a, s') \in \to$.

Figure 1 is a graphical representation of a transition system. The nodes of the graph
represent the states of the system. The edges, labelled by actions from $\Sigma$, reflect the
transition relation $\to$.

Clearly the structure of a transition system captures both the basic phenomena
present in sequential systems - causality and choice. The transition relation can be
used to determine the causal dependencies between system states. Choice is specified
by branching in the transition system. In other words, if $s \xrightarrow{a} s'$ and $s \xrightarrow{b} s''$ both
belong to the transition relation, then the system at state $s$ can choose between the
actions $a$ and $b$. For example, at $s_1$ the system shown in figure 1 can either move by
an $a$ to $s_2$ or move by a $b$ to $s_3$. In general, different choices available to the system
at a state may be labelled by the same action. In other words, the behaviour could
be nondeterministic. For instance, at $s_0$ this system can move on $b$ either to $s_5$ or to $s_3$.

In this example, starting at $s_3$, either the action $a$ can occur followed by the action
$b$ or the action $b$ can occur followed by the action $a$. In the interleaving approach
to concurrency, this situation often amounts to saying that $a$ and $b$ can occur
concurrently at $s_1$.

However, we would like to maintain a clear distinction between nondeterminism
and concurrency. Hence, to describe concurrency in a transition system, we enrich
the relation $\to$ by permitting a transition to be labelled by a finite set of actions from
$\Sigma$, rather than just by a single action. Thus, we will now have elements in $\to$ of the
form $s \xrightarrow{u} s'$, where $u$ is a finite subset of $\Sigma$. The idea is that the actions in $u$ can occur
at $s$ with no order over their occurrences. When they have all occurred, the resulting
state is $s'$. The set of actions $u$ is termed a concurrent step.

Henceforth, given a set $X$, $\wp(X)$ denotes the set of subsets of $X$ and $\wp_{fin}(X)$ denotes
the set of finite subsets of $X$. We can now formally define distributed transition
systems as follows.

DEFINITION  1.2 

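A distributed transition system (DTS) is a triple $DTS = (S, \Sigma, \to)$ where

(1)  $S$ is a set of states;

(2)  $\Sigma$ is a set of actions;

(3)  $\to \subseteq S \times \wp_{fin}(\Sigma) \times S$ is the step transition relation satisfying for all $s, s'$ in $S$:

(a)  $s \xrightarrow{\emptyset} s'$ iff $s = s'$.

(b)  for all $u \in \wp_{fin}(\Sigma)$, if $s \xrightarrow{u} s'$ then there exists a function $f: \wp(u) \to S$ such that
$f(\emptyset) = s$, $f(u) = s'$ and for every $v_1, v_2 \in \wp(u)$ with $v_1 \subseteq v_2$, it is the case that
$f(v_1) \xrightarrow{v_2 - v_1} f(v_2)$.

We often say that $DTS = (S, \Sigma, \to)$ is a DTS over $\Sigma$. For convenience, we write
$s \xrightarrow{a} s'$ instead of $s \xrightarrow{\{a\}} s'$.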

The new definition of $\to$ is a bit involved because we have to ensure that any
nontrivial “substep” of a concurrent step is also performed as a concurrent step. The
function $f$ in clause (3.b) is said to define a $u$-cube (from $s$ to $s'$). The existence of the
$u$-cube guarantees that the mutual independence of the actions in $u$ holds for all the
substeps as well. For example, figure 2 shows the cube generated by a concurrent
step consisting of three events. To avoid cluttering up the figure, “interior” arrows
such as $f_\emptyset \xrightarrow{\{a,b\}} f_{ab}$ and $f_b \xrightarrow{\{a,c\}} f_{abc}$ have not been drawn.
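The cube condition of clause (3.b) can be checked mechanically for a candidate function $f$. The Python sketch below is only an illustration under our own encoding: a step relation is a set of triples `(s, frozenset(u), s2)`, `f` is a dictionary from subsets of `u` to states, and the names `subsets` and `is_cube` are ours. The case $v_1 = v_2$ is skipped because the empty step from a state to itself always exists by clause (3.a).

```python
from itertools import combinations


def subsets(u):
    """All subsets of the finite set u, as frozensets."""
    items = list(u)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def is_cube(steps, f, s, u, s_prime):
    """Check that f (a dict from subsets of u to states) is a u-cube from s to s_prime."""
    u = frozenset(u)
    if f[frozenset()] != s or f[u] != s_prime:
        return False
    for v1 in subsets(u):
        for v2 in subsets(u):
            # Every nontrivial substep v2 - v1 must itself be a step of the system.
            if v1 < v2 and (f[v1], v2 - v1, f[v2]) not in steps:
                return False
    return True


# A hypothetical square generated by the concurrent step {a, b} from state 0 to state 3.
A, B = frozenset("a"), frozenset("b")
steps = {(0, A, 1), (0, B, 2), (1, B, 3), (2, A, 3), (0, A | B, 3)}
f = {frozenset(): 0, A: 1, B: 2, A | B: 3}
print(is_cube(steps, f, 0, {"a", "b"}, 3))
```

Removing any edge of the square, for instance `(1, B, 3)`, makes the check fail, which is exactly the situation clause (3.b) rules out for a concurrent step.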

Figure 2.  A “cube” generated by a concurrent step.

Figure 3 is an example of a distributed transition system modelling the allocation
of a shared resource to different processes within a system. In the example, we have
3 processes $P_1$, $P_2$ and $P_3$ functioning in an operating environment that supports
multiprocessing. The resource - say, for example, blocks of memory - is available in
“units”. There are 5 units available in total. The 3 processes require 2, 3 and 5 units,
respectively, of the resource at a time. In this DTS, $\Sigma = \{a_1, a_2, a_3, r_1, r_2, r_3\}$, where $a_i$
denotes that process $P_i$ has been allocated the entire amount of the resource that it
needs and $r_i$ denotes that $P_i$ has released the resource it has been allocated. The states
of the DTS are ordered pairs consisting of the number of unallocated units of the
resource available in the system along with the set of processes currently in possession
of their required quota of the shared resource.

Thus, at the state $(5, \emptyset)$, no processes are active and all 5 units of the resource are
available. At this state, the system can either allocate units of the resource to one of
the three processes, or perform a concurrent step allocating resources to both $P_1$ and
$P_2$. Notice that the transition from $(3, \{P_1\})$ to $(2, \{P_2\})$ can either be performed as
a concurrent step $\{a_2, r_1\}$ or by interleaving the two actions. In one interleaving,
however, $(5, \emptyset)$ is reached as an intermediate state, at which point the resource can
be allocated to $P_3$ instead of $P_2$. Thus, in this case, the effect of the interleavings is
not quite the same as that of the concurrent step.
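The difference between the concurrent step and its interleavings can be replayed with a small Python sketch of the resource-allocation example. The encoding is ours: a state is a pair `(free, holders)`, an action is a pair such as `("a", 2)` for $a_2$ or `("r", 1)` for $r_1$, and `fire` is a hypothetical helper. We assume that a step is enabled only if all its allocations fit into the units that are free before the step, so that units released in a step are not reused within it; the exact step relation of figure 3 may differ in other respects.

```python
TOTAL = 5
NEED = {1: 2, 2: 3, 3: 5}  # units required by P1, P2 and P3


def fire(state, step):
    """Return the state reached by performing the set `step` of actions at `state`,
    or None if the step is not enabled under the assumption stated above."""
    free, holders = state
    allocs = {i for kind, i in step if kind == "a"}
    releases = {i for kind, i in step if kind == "r"}
    if allocs & holders or not releases <= holders or allocs & releases:
        return None
    if sum(NEED[i] for i in allocs) > free:
        return None  # allocations must fit in the units free before the step
    free = free - sum(NEED[i] for i in allocs) + sum(NEED[i] for i in releases)
    return free, frozenset((holders | allocs) - releases)


start = (3, frozenset({1}))

# The concurrent step {a2, r1} goes directly to (2, {P2}).
print(fire(start, {("a", 2), ("r", 1)}))

# Interleaving r1 first passes through (5, {}), where a3 is also enabled.
middle = fire(start, {("r", 1)})
print(middle, fire(middle, {("a", 3)}))
```

In this sketch the interleaving that performs $r_1$ first exposes the intermediate state in which $P_3$ can grab all 5 units, a behaviour that the concurrent step itself does not allow.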

Figure 3.  A distributed transition system.

In general, it is important to note that clause (3.b) in definition 1.2 is merely an
implication. The existence of a function from $\wp(u)$ into $S$ which fulfills the stated
requirements does not guarantee the existence of a concurrent step. This is in line with
our philosophy that concurrency should be clearly differentiated from interleaving.
As we have seen above, interleavings may permit unintended deviations from the
behaviour expected of a concurrent step. In fact, it is possible to have a concurrent
step as well as an interleaving of the step performed at a state but leading to two
different states.

Finally, we introduce the important notion of reachability in a transition system.
Given $TS = (S, \Sigma, \to)$ we define $\mathcal{R}(s_0)$, the reachability set of $s_0 \in S$, as the least subset
of $S$ containing $s_0$ satisfying:

If $s \in \mathcal{R}(s_0)$, $a \in \Sigma$ and $s \xrightarrow{a} s'$, then $s' \in \mathcal{R}(s_0)$.

Thus, $\mathcal{R}(s_0)$ is the set of states reachable from $s_0$ in a finite number of steps using $\to$.
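For a finite transition system the reachability set can be computed by a standard breadth-first search, which yields exactly the least set closed under the rule above. The Python sketch below assumes our own encoding of the transition relation as triples `(s, a, s2)`; the function name and the small example system are hypothetical and do not reproduce figure 1.

```python
from collections import deque


def reachable(transitions, s0):
    """Least set containing s0 and closed under the transition relation.

    `transitions` is an iterable of (s, a, s2) triples.
    """
    succ = {}
    for s, _a, s2 in transitions:
        succ.setdefault(s, set()).add(s2)
    seen = {s0}
    queue = deque([s0])
    while queue:
        s = queue.popleft()
        for s2 in succ.get(s, ()):
            if s2 not in seen:
                seen.add(s2)
                queue.append(s2)
    return seen


# A hypothetical four-state system: t has no incoming transition from p.
ts = {("p", "a", "q"), ("p", "b", "r"), ("q", "b", "p"), ("t", "a", "p")}
print(sorted(reachable(ts, "p")))
```

The same search works for a DTS if each triple carries a finite set of actions instead of a single action, since only the source and target states are inspected.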

2.2  Elementary net systems

In  a  distributed  transition  system,  concurrency  is  explicitly  introduced  into  a  transition 
system  by  permitting  transitions  between  states  via  finite  sets  of  actions  called 
concurrent  steps.  In  effect,  the  notion  of  a  state  is  left  unchanged  and  the  transition 
relation  is  enriched  to  model  concurrency. 

An  alternative  way  of  introducing  concurrency  into  a  transition  system  is  to 
“distribute”  the  states  of  the  system.  The  states  of  a  DTS  correspond  to  the  global 
states  of  the  concurrent  system  being  modelled  by  the  DTS.  Rather  than  regard  these 
global states as indivisible entities, we can break them up into atomic components
which  can  be  regarded  as  the  local  states  of  the  different  processes  within  the  system. 
The  global  states  of  the  system  can  then  be  characterized  in  terms  of  the  local  states. 

By  distributing  the  states  of  the  model  in  this  manner,  we  can  clearly  distinguish 
concurrency  from  choice  without  having  to  define  a  transition  relation  involving  sets 
of  actions  as  in  a  DTS.  Instead,  the  transition  relation  is  designed  to  capture  the  fact 
that  the  change  of  state  accompanying  each  event  occurrence  in  the  system  is 
“localized”  to  those  processes  that  actually  participate  in  the  event.  As  a  result,  when 
an  event  occurs,  only  specific  local  components  of  the  global  state  are  affected,  leaving 
the  rest  of  the  components  untouched.  Thus,  two  events  that  are  enabled  at  a  global 
state  of  the  system  can  occur  concurrently  if  the  local  states  that  they  affect  are 
disjoint.  On  the  other  hand,  if  the  local  states  affected  by  the  two  events  overlap,  they 
cannot  both  occur  in  the  same  computation  at  that  state  and  so  a  choice  must  be 
made  between  them. 

Net  theory  deals  with  models  of  concurrent  systems  based  on  this  approach.  Here 
we  describe  elementary  net  systems,  which  are  a  basic  model  in  this  theory.  We  begin 
with  the  definition  of  a  net. 

DEFINITION  2.1 

A net is a triple $N = (B, E, F)$ where $B$ and $E$ are disjoint sets and $F \subseteq (B \times E) \cup (E \times B)$
satisfies:

$\forall x \in B \cup E : \exists y \in B \cup E : (x, y) \in F$ or $(y, x) \in F$.

The elements of $B$ are called conditions and are used to denote atomic local states.
The elements of $E$ are called events and are used to represent atomic actions. The
flow relation $F$ models a fixed neighbourhood relation between the conditions and
events  of  a  system.  This  flow  relation  determines  the  way  in  which  the  atomic  actions 
affect  the  atomic  local  states.  The  restriction  on  F  in  the  definition  of  a  net  ensures 
that  there  are  no  isolated  conditions  or  events. 

We  can  now  define  an  elementary  net  system  as  follows. 

DEFINITION  2.2 

An elementary net system is a quadruple $\mathcal{N} = (B, E, F, c_{in})$ where

(1) $N_{\mathcal{N}} = (B, E, F)$ is a net called the underlying net of $\mathcal{N}$.

(2) $c_{in} \subseteq B$ is the initial case.

Figure 4 is an example of an elementary net system. We have used the conventional
graphical notation for nets - conditions are represented by circles, events by boxes
and the flow relation by directed arcs. The "marked" conditions denote the initial
case $c_{in}$.

For $e$ in $E$ the conditions "pointing into" $e$ via $F$ are called the pre-conditions of
$e$ and are denoted by ${}^{\bullet}e$. Similarly, the conditions "pointing away" from $e$ via $F$ are
called the post-conditions of $e$ and are denoted by $e^{\bullet}$. More formally we have

${}^{\bullet}e \stackrel{\mathrm{def}}{=} \{b \mid (b, e) \in F\}$,
$e^{\bullet} \stackrel{\mathrm{def}}{=} \{b \mid (e, b) \in F\}$.

A state of a net system, called a case, consists of a set of conditions $c \subseteq B$. The
conditions in $c$ are said to hold when the system is at the case $c$. Thus, $c_{in}$ is the set
of conditions that hold when the system starts up.

The  system  moves  from  one  case  to  another  through  the  occurrence  of  events  from 
E.  An  event  can  occur  at  a  case  iff  all  its  pre-conditions  hold  and  none  of  its 
post-conditions  do  at  the  case.  When  an  event  occurs  all  its  pre-conditions  cease  to 
hold  and  all  its  post-conditions  begin  to  hold. 

In graphical terms, an event $e$ can occur at a case $c$ iff all the conditions pointing
into $e$ are "marked" at $c$ and none of the conditions pointing away from $e$ are. For
example, in figure 4, the event $e_1$ can occur at the initial case $c_{in} = \{b_1, b_2\}$. When $e_1$
occurs, we "unmark" all the pre-conditions of $e_1$ and "mark" all its post-conditions,
leaving the other conditions in $c_{in}$ untouched. Thus, after the occurrence of $e_1$, the
system is at the case $\{b_2, b_3\}$.

We can formalise this by defining $\rightarrow_N \subseteq \wp(B) \times E \times \wp(B)$, the elementary transition
relation generated by the net $N = (B, E, F)$, as follows.

$\rightarrow_N = \{(x, e, x') \mid x - x' = {}^{\bullet}e,\ x' - x = e^{\bullet}\}$

Using  this  transition  relation,  we  can  associate  a  transition  system  with  an  elementary 
net  system  as  follows. 

DEFINITION  2.3 

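Let $\mathcal{N} = (B, E, F, c_{in})$ be a net system.

(1) $C_{\mathcal{N}}$, the state space of $\mathcal{N}$, is the least subset of $\wp(B)$ containing $c_{in}$ such that if
$c \in C_{\mathcal{N}}$ and $(c, e, c') \in \rightarrow_N$, then $c' \in C_{\mathcal{N}}$.

(2) $TS_{\mathcal{N}} = (C_{\mathcal{N}}, E, \rightarrow_{\mathcal{N}})$ is the transition system associated with $\mathcal{N}$, where $\rightarrow_{\mathcal{N}}$ is $\rightarrow_N$
restricted to $C_{\mathcal{N}} \times E \times C_{\mathcal{N}}$.

For the net system shown in figure 4, $\{\{b_1, b_2\}, \{b_1, b_4\}, \{b_2, b_3\}, \{b_3, b_4\}\}$ is its state
space.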
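
The firing rule and the construction of the state space can be sketched in a few lines. This is a minimal illustration, assuming finite nets represented by Python sets; the net below is a hypothetical one chosen to be consistent with the cases quoted for figure 4 (the figure itself may differ), and all function names are illustrative.

```python
def pre(e, F):
    """Pre-conditions of e: all b with (b, e) in the flow relation F."""
    return frozenset(b for (b, x) in F if x == e)


def post(e, F):
    """Post-conditions of e: all b with (e, b) in the flow relation F."""
    return frozenset(b for (x, b) in F if x == e)


def enabled(c, e, F):
    """e can occur at case c iff its pre-conditions hold and no post-condition does."""
    return pre(e, F) <= c and not (post(e, F) & c)


def fire(c, e, F):
    """Case reached when the enabled event e occurs at c."""
    return (c - pre(e, F)) | post(e, F)


def state_space(E, F, c_in):
    """Least set of cases containing c_in and closed under event occurrences."""
    cases, todo = {c_in}, [c_in]
    while todo:
        c = todo.pop()
        for e in E:
            if enabled(c, e, F):
                c2 = fire(c, e, F)
                if c2 not in cases:
                    cases.add(c2)
                    todo.append(c2)
    return cases


# Assumed net: e1 moves the token from b1 to b3, e2 moves it from b2 to b4.
E = {"e1", "e2"}
F = {("b1", "e1"), ("e1", "b3"), ("b2", "e2"), ("e2", "b4")}
C_IN = frozenset({"b1", "b2"})
```

With this assumed net, `state_space(E, F, C_IN)` yields exactly the four cases listed above.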

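Let $\mathcal{N} = (B, E, F, c_{in})$ be a net system with $c \in C_{\mathcal{N}}$ and $e \in E$. Then $e$ is said to be
enabled at $c$ - denoted $c[e\rangle$ - iff there exists $c' \in C_{\mathcal{N}}$ such that $c \xrightarrow{e} c'$, where as usual
$c \xrightarrow{e} c'$ abbreviates $(c, e, c') \in \rightarrow_{\mathcal{N}}$.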

As  we  had  mentioned  at  the  beginning  of  this  section,  we  can  clearly  separate 
concurrency  from  choice  once  we  have  distributed  the  global  states  of  a  transition 
system  into  local  components. 

Let $\mathcal{N} = (B, E, F, c_{in})$ be a net system and $e, e' \in E$. We say that $e$ and $e'$ can occur
concurrently at a case $c$ - denoted $c[\{e, e'\}\rangle$ - iff $c[e\rangle$ and $c[e'\rangle$ and
$({}^{\bullet}e \cup e^{\bullet}) \cap ({}^{\bullet}e' \cup e'^{\bullet}) = \emptyset$. Thus, $e$ and $e'$ can occur concurrently at a case if they can occur
individually and their "neighbourhoods" are disjoint.

Similarly we can define the notion of conflict. Let $\mathcal{N}$ be a net system as above
with $e, e' \in E$. $e$ and $e'$ are said to be in conflict at a case $c$ iff $c[e\rangle$ and $c[e'\rangle$ but not
$c[\{e, e'\}\rangle$. Thus, if $e$ and $e'$ are in conflict at $c$, it means that they are both individually
enabled at $c$, but they cannot occur together at $c$. For the computation to proceed,
the conflict must be resolved by making a (nondeterministic) choice between the two
events.
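
The two definitions translate directly into set operations. The sketch below is illustrative only: the net, the case and the function names are assumptions made for the example, with `e1` and `e2` sharing a pre-condition and `e3` being independent of both.

```python
def pre(e, F):
    return frozenset(b for (b, x) in F if x == e)


def post(e, F):
    return frozenset(b for (x, b) in F if x == e)


def enabled(c, e, F):
    return pre(e, F) <= c and not (post(e, F) & c)


def neighbourhood(e, F):
    """Union of the pre-conditions and post-conditions of e."""
    return pre(e, F) | post(e, F)


def concurrently_enabled(c, e1, e2, F):
    """Both events are enabled at c and their neighbourhoods are disjoint."""
    return (enabled(c, e1, F) and enabled(c, e2, F)
            and not (neighbourhood(e1, F) & neighbourhood(e2, F)))


def in_conflict(c, e1, e2, F):
    """Both events are enabled at c but they cannot occur together."""
    return (enabled(c, e1, F) and enabled(c, e2, F)
            and not concurrently_enabled(c, e1, e2, F))


# Hypothetical net: e1 and e2 compete for b1, e3 only touches b2 and b5.
F = {("b1", "e1"), ("e1", "b3"),
     ("b1", "e2"), ("e2", "b4"),
     ("b2", "e3"), ("e3", "b5")}
CASE = frozenset({"b1", "b2"})
```

At `CASE`, `e1` and `e3` can occur concurrently, while `e1` and `e2` are in conflict because both need the condition `b1`.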

The definition of $\rightarrow_{\mathcal{N}}$ is designed to ensure that the notion of change of state in
an elementary net system is fairly restricted.

First, notice that an event must cause the same change in the system state whenever
it occurs; its pre-conditions cease to hold and its post-conditions begin to hold. Thus,
if $c_1 \xrightarrow{e} c_2$ and $c_3 \xrightarrow{e} c_4$ are both possible in a net system, then it must be the case that
$c_1 - c_2 = c_3 - c_4 = {}^{\bullet}e$ and $c_2 - c_1 = c_4 - c_3 = e^{\bullet}$.

Further, to determine whether an event $e$ is enabled at a case $c$, it is sufficient to
look at the conditions contained in ${}^{\bullet}e$ and $e^{\bullet}$: $e$ is enabled at $c$ iff ${}^{\bullet}e \subseteq c$ and $e^{\bullet} \cap c = \emptyset$ - no
"side-conditions" are involved in the enabling of an event.

Finally, it turns out that the transition system $TS_{\mathcal{N}}$ associated with a net system
is deterministic; that is, $c \xrightarrow{e} c'$ and $c \xrightarrow{e} c''$ implies that $c' = c''$. To connect up
with other approaches to the theory of distributed systems, nondeterminism can be
introduced into $TS_{\mathcal{N}}$ by labelling the events in $E$. We shall come back to this point
later in this section.

Let  us  consider  an  example  of  modelling  a  distributed  system  using  an  unlabelled 
elementary  net  system.  Consider  the  problem  of  sharing  resources  in  a  distributed 
system. Suppose that there are two processes $P_1$ and $P_2$ in the system which require
access  to  a  common  resource  r.  Suppose  that  r  can  be  used  by  only  one  process  at 
a  time  -  r  could,  for  instance,  be  a  printer.  Then,  when  one  of  the  processes  is  granted 
access  to  r,  the  other  process  should  be  prevented  from  accessing  r  till  the  first  process 
releases  it.  This  will  ensure  that  at  any  state  during  a  computation  of  the  system,  at 
most  one  process  can  actually  be  using  that  resource. 

Figure 5 models a solution to this problem of mutual exclusion. In this net system
the process $P_i$, $i = 1, 2$, is represented by the conditions $\{b^i_0, b^i_1, b^i_2, b^i_3\}$ and the events
$\{e^i_0, e^i_1, e^i_2, e^i_3\}$. Each process is modelled as a simple loop consisting of four
events - getting access to $r$ ($e^i_0$), utilizing $r$ ($e^i_1$), releasing $r$ ($e^i_2$) and performing some
internal computations not involving $r$ ($e^i_3$). At the initial case, both processes are
waiting for access to $r$. The additional condition $a$ functions as an arbitrator which
enforces mutual exclusion of access to $r$. For example, suppose that $e^2_0$ occurs initially,
giving $P_2$ access to $r$. Since $a$ ceases to hold, $e^1_0$ is no longer enabled. Thus, $P_1$ can
gain access to $r$ only after $P_2$ releases $r$ by the occurrence of $e^2_2$. It is easy to check
that $b^1_1$ and $b^2_1$ can never hold together in this net system. On the other hand, the
conditions $b^1_3$ and $b^2_3$ can hold at the same case - that is, the events $e^1_3$ and $e^2_3$ which
do not involve the use of $r$ can occur concurrently in this system.
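
The mutual exclusion property can be checked mechanically by exploring the state space. The sketch below does not reproduce figure 5; it builds a net from the verbal description only, under the assumptions that each process is the loop b_0 -> e_0 -> b_1 -> e_1 -> b_2 -> e_2 -> b_3 -> e_3 -> b_0, that the arbitrator condition `a` is a pre-condition of each e_0 and a post-condition of each e_2, and that the initial case consists of both b_0 conditions together with `a`. The identifiers (`b1_0`, `e2_2`, and so on) are hypothetical encodings of the indexed names.

```python
def pre(e, F):
    return frozenset(b for (b, x) in F if x == e)


def post(e, F):
    return frozenset(b for (x, b) in F if x == e)


def enabled(c, e, F):
    return pre(e, F) <= c and not (post(e, F) & c)


def fire(c, e, F):
    return (c - pre(e, F)) | post(e, F)


def state_space(E, F, c_in):
    cases, todo = {c_in}, [c_in]
    while todo:
        c = todo.pop()
        for e in E:
            if enabled(c, e, F):
                c2 = fire(c, e, F)
                if c2 not in cases:
                    cases.add(c2)
                    todo.append(c2)
    return cases


def build_mutex_net():
    """Assumed reconstruction of the two-process mutual exclusion net."""
    E, F = [], set()
    for i in (1, 2):
        for k in range(4):
            e = f"e{i}_{k}"
            E.append(e)
            F.add((f"b{i}_{k}", e))                # loop: b_k enables e_k
            F.add((e, f"b{i}_{(k + 1) % 4}"))      # e_k leads to b_(k+1 mod 4)
        F.add(("a", f"e{i}_0"))                    # acquiring r consumes a
        F.add((f"e{i}_2", "a"))                    # releasing r restores a
    return E, F, frozenset({"b1_0", "b2_0", "a"})


E, F, C_IN = build_mutex_net()
CASES = state_space(E, F, C_IN)


def holds_resource(i, c):
    """Process i has acquired r and not yet released it (assumed: b_1 or b_2 holds)."""
    return bool({f"b{i}_1", f"b{i}_2"} & c)
```

Under these assumptions no reachable case has both processes holding the resource, while there is a reachable case in which both `b1_3` and `b2_3` hold.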

Finally, we show that we can describe the behaviour of elementary net systems
in terms of distributed transition systems. Consider an elementary net system
$\mathcal{N} = (B, E, F, c_{in})$. The transition system $TS_{\mathcal{N}}$ contains information about the causality
and conflict present in $\mathcal{N}$. To describe the concurrency present in $\mathcal{N}$, it is sufficient
to augment $TS_{\mathcal{N}}$ with additional transitions labelled by concurrent steps, as follows.

We first extend the notion of a pair of events being concurrently enabled at a case
to a set of events. Let $u = \{e_1, e_2, \ldots, e_k\}$ be a finite subset of $E$. We say that $u$ is
concurrently enabled at a case $c \in C_{\mathcal{N}}$ - denoted $c[u\rangle$ - iff $c[e_i\rangle$ for each $e_i \in u$ and,
further, $c[\{e_i, e_j\}\rangle$ for every pair of distinct events $e_i, e_j \in u$.

We can then define the step transition relation $\Rightarrow_{\mathcal{N}}$ as follows.

$\Rightarrow_{\mathcal{N}} = \{(c, u, c') \mid c, c' \in C_{\mathcal{N}},\ c[u\rangle$ and $c - c' = {}^{\bullet}u,\ c' - c = u^{\bullet}\}$.

Here ${}^{\bullet}u$ and $u^{\bullet}$ denote the unions of the pre-conditions and post-conditions of the
events contained in $u$. Note that $\rightarrow_{\mathcal{N}}$ is "included" in $\Rightarrow_{\mathcal{N}}$ in the sense that if
$(c, e, c') \in \rightarrow_{\mathcal{N}}$ then $(c, \{e\}, c') \in \Rightarrow_{\mathcal{N}}$. We can then immediately establish the following.

PROPOSITION  2.4. 

$DTS_{\mathcal{N}} = (C_{\mathcal{N}}, E, \Rightarrow_{\mathcal{N}})$ is a distributed transition system over $E$.

It is easy to verify that the concurrency and choice present in $\mathcal{N}$ is precisely captured
by the DTS $DTS_{\mathcal{N}}$.
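
To make the step transition relation concrete, the following sketch enumerates the non-empty concurrent steps enabled at a case of a finite net, together with the case each step leads to. The net is again the hypothetical two-event net used to illustrate figure 4, the function names are illustrative, and the choice to list only non-empty steps is an assumption of the example rather than something fixed by the text.

```python
from itertools import combinations


def pre(e, F):
    return frozenset(b for (b, x) in F if x == e)


def post(e, F):
    return frozenset(b for (x, b) in F if x == e)


def enabled(c, e, F):
    return pre(e, F) <= c and not (post(e, F) & c)


def neighbourhood(e, F):
    return pre(e, F) | post(e, F)


def step_enabled(c, u, F):
    """Every event of u is enabled at c and distinct events have disjoint neighbourhoods."""
    return (all(enabled(c, e, F) for e in u)
            and all(not (neighbourhood(e1, F) & neighbourhood(e2, F))
                    for e1, e2 in combinations(u, 2)))


def fire_step(c, u, F):
    """Remove the pre-conditions of all events in u and add their post-conditions."""
    pre_u = frozenset().union(*(pre(e, F) for e in u))
    post_u = frozenset().union(*(post(e, F) for e in u))
    return (c - pre_u) | post_u


def steps_at(c, E, F):
    """Map each non-empty step enabled at c to the case it leads to."""
    result = {}
    for k in range(1, len(E) + 1):
        for u in combinations(sorted(E), k):
            if step_enabled(c, u, F):
                result[frozenset(u)] = fire_step(c, u, F)
    return result


E = {"e1", "e2"}
F = {("b1", "e1"), ("e1", "b3"), ("b2", "e2"), ("e2", "b4")}
C_IN = frozenset({"b1", "b2"})
```

At `C_IN` the steps `{e1}`, `{e2}` and `{e1, e2}` are all enabled, and the joint step leads directly to the case containing `b3` and `b4`.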

However, notice that this DTS is deterministic, for the same reason that the transition
system $TS_{\mathcal{N}}$ is. As we had mentioned earlier, we can introduce nondeterminism by
labelling the events.

DEFINITION  2.5. 

A $\Sigma$-labelled elementary net system is a pair $\mathcal{N}_\Sigma = (\mathcal{N}, \phi)$, where $\mathcal{N} = (B, E, F, c_{in})$ is
an elementary net system, called the underlying net system of $\mathcal{N}_\Sigma$, $\Sigma$ is a set of labels
and $\phi : E \rightarrow \Sigma$ is the labelling function.

The notions we have developed for net systems can be transported to labelled net
systems in the obvious way. To represent the behaviour of a labelled net system $\mathcal{N}_\Sigma$
as a DTS, we can define $DTS_{\mathcal{N}_\Sigma}$ to be the DTS over $\Sigma$ obtained by using the labelling
function $\phi$ to rename the actions in $DTS_{\mathcal{N}}$, the DTS over $E$ generated by the underlying
net system $\mathcal{N}$.

However, in general we need to place a restriction on the labelling function in
order to get a neat translation from labelled net systems to distributed transition
systems. In a DTS, we have restricted concurrent steps to be sets of actions. On the
other hand, a labelled net system may generate a concurrent step in $DTS_{\mathcal{N}_\Sigma}$ where
two distinct events in the step have the same label. To avoid dealing with multisets
in concurrent steps that arise in this fashion, we require that events which can occur
concurrently in the underlying net system $\mathcal{N}$ have distinct labels.

Let $\mathcal{N}_\Sigma = (B, E, F, c_{in}, \phi)$ be a $\Sigma$-labelled net system. The labelling function $\phi$ is said
to be co-injective if it satisfies the following condition.

$\forall e_1, e_2 \in E : (\exists c \in C_{\mathcal{N}} . c[\{e_1, e_2\}\rangle)$ implies $\phi(e_1) \neq \phi(e_2)$.

PROPOSITION 2.6.

Let $\mathcal{N}_\Sigma = (\mathcal{N}, \phi)$ be a $\Sigma$-labelled elementary net system, where $\mathcal{N} = (B, E, F, c_{in})$, such
that $\phi$ is co-injective. Then $DTS_{\mathcal{N}_\Sigma} = (C_{\mathcal{N}}, \Sigma, \Rightarrow_{\mathcal{N}_\Sigma})$ is a DTS over $\Sigma$, where

$\Rightarrow_{\mathcal{N}_\Sigma} = \{(c, \phi(u), c') \mid (c, u, c') \in \Rightarrow_{\mathcal{N}}\}$.

2.3  Event  structures

To  reason  about  the  behaviour  of  a  distributed  transition  system  or  an  elementary 
net  system,  we  have  to  examine  all  the  computations  of  the  underlying  “machines” 
defined  by  the  model.  For  this,  it  is  convenient  to  work  with  an  abstract  representation 
of  the  entire  behaviour  of  the  system.  This  behavioural  description  should  include 
information  about  all  the  computations  of  the  system,  explicitly  identifying  the  causal 
dependencies  and  concurrency  present  within  each  computation.  In  addition,  it  should 
also  have  a  way  of  describing  the  branching  points  in  the  system  behaviour. 

Before discussing behavioural representations of concurrent systems, let us first go
back to sequential transition systems. A computation of a sequential transition system
$TS = (S, \Sigma, \rightarrow)$ starting at some state $s_0 \in S$ is an alternating sequence of states and
actions which obeys the transition relation $\rightarrow$. We shall restrict our attention to the
maximal computations of the system - those that cannot be extended by performing
any more actions. Thus, a maximal computation is a finite sequence just in case a
state is reached at the end of the sequence from which no transition is possible;
otherwise, it is an infinite sequence.

A natural way to group together the sequences which correspond to computations
of $TS = (S, \Sigma, \rightarrow)$ starting from $s_0$ is in the form of a tree. The nodes of the tree are
labelled by states from $S$ and the edges are labelled by actions from $\Sigma$. The root node
is labelled by the initial state $s_0$. Each maximal path in the tree now corresponds to
a computation of the system. The branching points in the tree are the states where
the system makes choices between different possible actions.

In  the  case  of  models  exhibiting  concurrency,  the  situation  is  more  complicated. 
A  computation  of  such  a  system  is  a  partially  ordered  set  of  actions,  not  a  simple 
sequence,  so  we  need  a  more  sophisticated  method  of  collecting  all  the  computations 
together  in  a  single  structure.  An  elegant  way  of  achieving  this  is  to  use  event  structures. 
Event  structures  are  behavioural  models  of  distributed  systems  in  which  causality, 
concurrency  and  choice  (conflict)  are  represented  explicitly. 

Prime  event  structures,  introduced  in  Nielsen  et  al  (1980),  are  the  simplest  type  of 
event  structures.  They  have  a  rich  theory  and  are  closely  related  to  both  net  systems 
and  domains.  Since  we  deal  only  with  prime  event  structures  in  this  paper,  henceforth 
we  shall  simply  call  them  event  structures. 

DEFINITION  3.1 

An event structure is a triple $ES = (E, \leq, \#)$ where

(1) $E$ is a set of event occurrences.

(2) $\leq\ \subseteq E \times E$ is a partial order called the causality relation.

(3) $\#\ \subseteq E \times E$ is an irreflexive and symmetric conflict relation.

(4) $\#$ is inherited via $\leq$ in the sense that $e_1 \# e_2 \leq e_3$ implies that $e_1 \# e_3$ for every
$e_1, e_2, e_3$ in $E$.

An element of $E$ represents the occurrence of an event within a specific context. Thus,
if the same event can occur in different contexts, "copies" of it will be present in the
event structure. This is why we have called the elements of $E$ event occurrences rather
than events.

If $e_1 \leq e_2$, then $e_2$ is causally dependent on $e_1$. Thus, in any computation of the
system, $e_2$ can occur only if $e_1$ has already occurred. As usual we let $\geq$ stand for $\leq^{-1}$.
The $\#$ relation identifies pairs of events which are inconsistent with each other and
hence cannot both occur during the same computation. The last clause of
definition 3.1 ensures that if $e_1 \# e_2$ then events that are causally dependent on $e_1$ are
in conflict with events that are causally dependent on $e_2$ - in other words, the
inconsistency of $e_1$ and $e_2$ is inherited by events that follow these two events.

Two events that are neither causally related nor in conflict with each other can
both occur within a computation with no order over their occurrence. We can thus
define the concurrency relation $co$ in an event structure $ES = (E, \leq, \#)$ in terms of $\leq$
and $\#$ as follows.

$co = E \times E - (\leq \cup \geq \cup\ \#)$.

Notice that $co$, like $\#$, is irreflexive and symmetric. Clearly, every pair of distinct events
in an event structure belongs to exactly one of the four relations $\{\leq, \geq, \#, co\}$.
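
For a finite event structure these relations can be computed explicitly. The sketch below is an illustration under stated assumptions: causality is supplied as the edges of its Hasse diagram, conflict is supplied as a few "basic" pairs and then closed under inheritance, and the four events are hypothetical. The function names are not taken from the text.

```python
def causal_closure(E, covers):
    """Reflexive-transitive closure of the Hasse edges, as a set of pairs (a, b) with a <= b."""
    leq = {(e, e) for e in E} | set(covers)
    while True:
        new = {(a, d) for (a, b) in leq for (c, d) in leq if b == c} - leq
        if not new:
            return leq
        leq |= new


def inherit_conflict(leq, basic):
    """Least symmetric relation containing `basic` that is inherited via <=."""
    conflict = set()
    for (a, b) in basic:
        above_a = {x for (p, x) in leq if p == a}
        above_b = {y for (q, y) in leq if q == b}
        for x in above_a:
            for y in above_b:
                conflict.add((x, y))
                conflict.add((y, x))
    return conflict


def co_relation(E, leq, conflict):
    """co = E x E - (<= union >= union #)."""
    return {(x, y) for x in E for y in E
            if (x, y) not in leq and (y, x) not in leq and (x, y) not in conflict}


# Hypothetical event structure: e2 <= e4, e1 # e2, and e3 unrelated to everything.
E = {"e1", "e2", "e3", "e4"}
LEQ = causal_closure(E, {("e2", "e4")})
CONF = inherit_conflict(LEQ, {("e1", "e2")})
CO = co_relation(E, LEQ, CONF)
```

In this example the conflict between `e1` and `e2` is inherited by `e4`, and `e3` is concurrent with every other event. Note that `inherit_conflict` does not check irreflexivity; an input in which one event lies above both members of a basic conflict would not describe an event structure.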

It is useful to define one more auxiliary relation. Let $ES = (E, \leq, \#)$ be an event
structure and $e, e' \in E$. Then

$e \#_\mu e'$ iff $e \# e'$ and $\forall e_1, e'_1 \in E$: [$e_1 \leq e$ and $e'_1 \leq e'$ and
$e_1 \# e'_1$ implies $e_1 = e$ and $e'_1 = e'$].

$\#_\mu$ identifies the minimal elements (under $\leq$) of the $\#$ relation and is hence called the
minimal conflict relation. $\#_\mu$ identifies the actual branching points in the behaviour
where choices are made between conflicting events. This "basic" conflict then
propagates to causally related events and "generates" other conflicts.

Figure 6 is an example of an event structure. The squiggly lines represent the $\#_\mu$
relation. The causality relation is shown in the form of the associated Hasse diagram.
The $\#$ relation is then uniquely determined by the last part of definition 3.1. In this
event structure, $e_1 \# e_6$ because $e_1 \#_\mu e_2 \leq e_6$. It is also easy to see that $e_6\ co\ e_7$.

Figure 6. An event structure.

The states of an event structure are called configurations. A configuration identifies
a set of events that have occurred "so far". An event can occur only if all the events
in its past have occurred. Two events that are in conflict can never both occur in the
same stretch of behaviour. Before formalizing these notions it will be convenient to
adopt the following notation.

Let $ES = (E, \leq, \#)$ be an event structure and $X \subseteq E$. Then $\downarrow X = \{e' \mid \exists e \in X : e' \leq e\}$.
For the singleton $\{e\}$, we shall write $\downarrow e$ instead of $\downarrow \{e\}$.

DEFINITION  3.2 

Let $ES = (E, \leq, \#)$ be an event structure and $c \subseteq E$. Then $c$ is a configuration iff

(1) $c = \downarrow c$ (left-closed),

(2) $(c \times c) \cap \# = \emptyset$ (conflict-free).

For the event structure shown in figure 6, $\{e_2, e_5, e_6\}$ is a configuration. $\{e_2, e_5, e_{10}\}$
is not a configuration because it is not left-closed and $\{e_3, e_7, e_8\}$ is not a configuration
because it is not conflict-free.
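
Both conditions are easy to test on a finite event structure. The sketch below does not encode figure 6; it uses a small hypothetical event structure in which $e_1 \leq e_3$, $e_2 \leq e_4$ and $e_1 \# e_2$, with the inherited conflicts written out explicitly. Left-closure is checked against the Hasse edges, which for a finite structure is equivalent to closure under $\leq$. All names are illustrative.

```python
def is_left_closed(c, covers):
    """Every immediate predecessor of an event in c is also in c."""
    return all(a in c for (a, b) in covers if b in c)


def is_conflict_free(c, conflict):
    """No two events of c are in conflict; `conflict` holds unordered pairs."""
    return not any(frozenset((a, b)) in conflict for a in c for b in c)


def is_configuration(c, covers, conflict):
    return is_left_closed(c, covers) and is_conflict_free(c, conflict)


# Hypothetical event structure (not figure 6).
COVERS = {("e1", "e3"), ("e2", "e4")}
CONFLICT = {frozenset(p) for p in
            [("e1", "e2"), ("e1", "e4"), ("e3", "e2"), ("e3", "e4")]}
```

For instance, `{"e1", "e3"}` is a configuration, `{"e3"}` fails because it is not left-closed, and `{"e1", "e2"}` fails because it is not conflict-free.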

We  are  particularly  interested  in  a  restricted  subset  of  configurations  called  local 
configurations.  The  notion  of  a  local  configuration  is  based  on  a  simple  but  crucial 
observation  which  lies  at  the  heart  of  the  theory  of  event  structures  (Nielsen  et  al  1980). 

PROPOSITION  3.3. 

Let $ES = (E, \leq, \#)$ be an event structure and $e \in E$. Then $\downarrow e$ is a configuration.

We now define $LC_{ES} = \{\downarrow e \mid e \in E\}$ to be the set of local configurations of the event
structure $ES = (E, \leq, \#)$.

We do so because a (general) configuration $c \subseteq E$ can be viewed as a global state
of the system. Parts of a global configuration may change independently of each other,
due to the spatial separation and the partial autonomy of the individual agents in
the system being modelled by the event structure. A finite global configuration $c$ is
completely characterized by specifying the maximal events (with respect to $\leq$) which
belong to $c$. Each local configuration $\downarrow e$ corresponding to a maximal event $e \in c$ can
be regarded as a local state which contributes to the global state at $c$.

When  we  reason  about  the  behaviour  of  an  event  structure,  we  would  like  to  make 
assertions  about  properties  that  are  satisfied  by  the  global  configurations  -  that  is, 
properties  that  hold  at  the  global  states  of  the  system.  However,  a  global  state  can 
be  completely  described  in  terms  of  all  the  local  states  that  are  part  of  that  global  state. 
Thus,  we  shall  restrict  ourselves  to  specifying  properties  at  the  local  configurations. 
Using  combinations  of  these  assertions,  we  can  describe  global  configurations  of  the 
event  structure.  Further,  the  assertions  that  we  can  make  about  a  global  configuration 
are  tied  down  to  the  assertions  that  we  can  make  about  the  local  configurations  that 
constitute  the  global  configuration.  This  will  become  clearer  in  the  second  part  of 
the  paper  where  we  discuss  how  to  specify  properties  of  distributed  systems. 

As  we  had  mentioned  at  the  beginning  of  this  section,  an  event  structure  is  a  single 
entity  which  describes  all  the  computations  of  a  distributed  system.  Thus,  we  need  a 
means  of  “extracting”  individual  computations  from  an  event  structure.  Since  a 
configuration  represents  a  set  of  events  that  have  happened  so  far,  in  general,  an 
arbitrary  configuration  represents  a  partial  computation  of  the  system.  If  we  consider 
configurations  which  are  maximal  (with  respect  to  inclusion)  we  obtain  the  maximal 
computations  of  the  event  structure.  We  call  these  the  runs  of  the  event  structure.  It 
is easy to verify the following characterization of runs. Let $r \subseteq E$. Then $r$ is a run iff

$\forall e \in E : e \in r$ iff $\forall e' \in E : e \# e'$ implies $e' \notin r$.
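
On a finite event structure the characterization can be compared with the definition of a run as a maximal configuration by brute force. The following sketch assumes a small hypothetical event structure (two conflicting chains plus an independent event `e5`); the function names and the example are illustrative only, and the enumeration of all subsets is feasible only for very small event sets.

```python
from itertools import combinations


def subsets(E):
    items = sorted(E)
    return [frozenset(s) for k in range(len(items) + 1)
            for s in combinations(items, k)]


def is_configuration(c, covers, conflict):
    left_closed = all(a in c for (a, b) in covers if b in c)
    conflict_free = not any(frozenset((a, b)) in conflict for a in c for b in c)
    return left_closed and conflict_free


def runs_by_maximality(E, covers, conflict):
    """Configurations that are maximal with respect to inclusion."""
    configs = [c for c in subsets(E) if is_configuration(c, covers, conflict)]
    return {c for c in configs if not any(c < d for d in configs)}


def is_run(r, E, conflict):
    """e is in r exactly when no event in conflict with e is in r."""
    return all(
        (e in r) == all(f not in r for f in E if frozenset((e, f)) in conflict)
        for e in E
    )


# Hypothetical event structure: e1 <= e3, e2 <= e4, e1 # e2 (inherited), e5 independent.
E = {"e1", "e2", "e3", "e4", "e5"}
COVERS = {("e1", "e3"), ("e2", "e4")}
CONFLICT = {frozenset(p) for p in
            [("e1", "e2"), ("e1", "e4"), ("e3", "e2"), ("e3", "e4")]}
RUNS = runs_by_maximality(E, COVERS, CONFLICT)
```

In this example both approaches single out the same two runs, one for each side of the conflict, each containing the independent event `e5`.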

Next, let us look at some useful restrictions on event structures. We begin with the
auxiliary relation $\#_\mu$. In general, there may be pairs of events in $\#$ whose inconsistency cannot
be traced back to a pair of events in $\#_\mu$ - a typical example consists of two infinite
descending chains of events in $\#$ with each other. We would like to rule out such
structures, since they model behaviours which are intuitively infeasible. We can
therefore restrict our attention to well-branching event structures.

DEFINITION  3.4 

Let $ES = (E, \leq, \#)$ be an event structure. $ES$ is well-branching iff

$\forall e, e' \in E : e \# e'$ implies $\exists e_1, e'_1 \in E : e_1 \leq e$ and $e'_1 \leq e'$ and $e_1 \#_\mu e'_1$.

Well-branching is a fairly weak restriction. A stronger and more useful restriction is
that of finitariness. An event structure $ES = (E, \leq, \#)$ is said to be finitary in case $\downarrow e$
is a finite set for every $e \in E$. Finitariness captures the important fact that in any
realizable system, an event can be causally dependent on only a finite set of events.
An event with an infinite past can never actually occur.

There is a systematic way of describing the behaviour of elementary net systems
using finitary event structures. To do this, we require labelled event structures. A
labelled event structure is a pair $ES_\Sigma = (ES, \phi)$ where $ES = (E, \leq, \#)$ is an event
structure and $\phi : E \rightarrow \Sigma$ is a labelling function.

Constructing a labelled finitary event structure describing the behaviour of a net
system involves an intermediate stage where the net system is "unfolded" to generate
an acyclic structure. The details are a bit involved and can be found in Nielsen et al
(1980) and Thiagarajan (1990). We shall merely present an example.

Consider  the  elementary  net  system  in  figure  5  modelling  mutual  exclusion.  The 
labelled  event  structure  in  figure  7  describes  the  behaviour  of  this  system.  In  this  case, 
the  event  occurrences  in  the  event  structure  are  labelled  by  the  events  of  the  net  system. 

Figure 7. A labelled event structure.

Given a finitary event structure $ES$, we can construct a DTS $DTS_{ES}$ which exhibits
the same behaviour as $ES$. Let $\mathcal{C}_{ES}$ denote the set of finite configurations of the
finitary event structure $ES = (E, \leq, \#)$. We can define the step transition relation
$\rightarrow_{ES}\ \subseteq \mathcal{C}_{ES} \times \wp_{fin}(E) \times \mathcal{C}_{ES}$ as follows:

$\rightarrow_{ES} = \{(c, u, c') \mid c \cap u = \emptyset$ and $c \cup u = c'$ and
$\forall e_1, e_2 \in u : e_1 \neq e_2$ implies $e_1\ co\ e_2\}$.

PROPOSITION 3.5.

$DTS_{ES} = (\mathcal{C}_{ES}, E, \rightarrow_{ES})$ is a DTS over $E$.

As in the case of elementary net systems, it turns out that $DTS_{ES}$ is always
deterministic. Once again, we can use labelled event structures to permit
nondeterminism in this DTS. As before, we have to restrict the labelling to be co-injective
to rule out multisets in concurrent steps. In other words, given $ES_\Sigma = (E, \leq, \#, \phi)$, we
require that for every $e_1, e_2 \in E$: $e_1\ co\ e_2$ implies $\phi(e_1) \neq \phi(e_2)$. We then have the
following result.

PROPOSITION 3.6.

Let $ES_\Sigma = (ES, \phi)$ be a $\Sigma$-labelled event structure where $\phi$ is a co-injective labelling
function. Then $DTS_{ES_\Sigma} = (\mathcal{C}_{ES}, \Sigma, \Rightarrow_{ES_\Sigma})$ is a DTS over $\Sigma$ where

$\Rightarrow_{ES_\Sigma} = \{(c, \phi(u), c') \mid (c, u, c') \in \rightarrow_{ES}\}$.

2.4  Communicating  sequential  agents 

In  an  event  structure,  the  entire  behaviour  of  a  distributed  system  is  specified  as  a 
single  entity.  Individual  computations  of  the  system  can  be  identified  using  the  notion 
of  a  run.  However,  no  further  information  is  provided  about  the  structure  of  the  system. 

Consider  a  distributed  system  consisting  of  a  finite  set  of  sequential  agents 
performing  a  joint  task,  using  communication  to  coordinate  their  activities.  When 
reasoning  about  the  behaviour  of  such  a  system,  it  is  convenient  to  associate  the 
events  occurring  in  the  system  with  the  agents  involved  in  the  events.  This  can  be 
captured  by  restricting  event  structures  to  a  model  called  communicating  sequential 
agents  (CSA). 

Let $N$ denote the set of natural numbers $\{1, 2, 3, \ldots\}$. We shall use elements of $N$
as names for the agents in our system.

DEFINITION 4.1

A system of communicating sequential agents (CSA) is a triple $CSA = (E, \leq, \eta)$, where

(1) $E$ is a non-empty set of event occurrences;

(2) $\leq$ is a partial order on $E$ called the causality relation;

(3) $\eta : E \rightarrow \wp_{fin}(N)$ is a naming function assigning to each $e$ in $E$ a non-empty finite
subset of $N$;

(4) Let $E_j = \{e \mid e \in E$ and $j \in \eta(e)\}$. Then, for every $e$ in $E$:

$\forall j \in N : \downarrow e \cap E_j$ is totally ordered by $\leq$.

We interpret $j \in \eta(e)$ as the agent $j$ participating in the event $e$. Thus $\eta(e) = \{1, 2\}$ can
stand for a synchronization "handshake" between agents 1 and 2.
The poset $(E_j, \leq_j)$, where $\leq_j$ is $\leq$ restricted to $E_j \times E_j$, represents the local
behaviour of agent $j$ in CSA. Usually, we say "agent $j$" to denote this poset.

As in an event structure, if $e_1 \leq e_2$ then $e_2$ causally depends on $e_1$; in no run of
CSA can $e_2$ occur without $e_1$ having occurred earlier.

To separate concurrency from conflict, both the causality relation $\leq$ and the naming
function $\eta$ are used. In a CSA, each agent is defined to be sequential. Thus, given any
two events $e$ and $e'$ which both involve the same agent - that is $\eta(e)$ and $\eta(e')$ are not
disjoint - $e$ and $e'$ must either be causally related or in conflict. So if $e$ and $e'$ are
incomparable with respect to $\leq$ and $\eta(e) \cap \eta(e') \neq \emptyset$, then $e$ and $e'$ are in conflict.

The motivation for the last condition in definition 4.1 should now be clear: we do
not wish an event occurrence to causally depend upon conflicting event occurrences.
This condition also implicitly ensures that the basic conflict in the system is generated
within agents - in effect, choices are made locally by individual agents and then
propagated across agents via $\leq$.

On the other hand, if two events $e$ and $e'$ are unordered and their combined past
does not contain any conflicting events they must be concurrent. Since choices are
assumed to be made locally, it is sufficient to check that for each agent $j$, the combined
past of $e$ and $e'$ does not have incomparable events involving $j$. In other words, if
$(\downarrow e \cup \downarrow e') \cap E_j$ is totally ordered by $\leq$ for every $j$, then the two events $e$ and $e'$ are
concurrent.
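
The last condition of definition 4.1 and the derived concurrency test can be sketched for a finite CSA as follows. The example is hypothetical: agent 1 performs `p1` and then chooses between `p2` and `q1`, while agent 2 performs `c1` after `p1`. The function names and the representation (Hasse edges plus a dictionary for the naming function) are assumptions of the sketch.

```python
def causal_closure(E, covers):
    """Reflexive-transitive closure of the Hasse edges."""
    leq = {(e, e) for e in E} | set(covers)
    while True:
        new = {(a, d) for (a, b) in leq for (c, d) in leq if b == c} - leq
        if not new:
            return leq
        leq |= new


def down(e, leq):
    """The past of e: all events below or equal to e."""
    return {a for (a, b) in leq if b == e}


def totally_ordered(X, leq):
    return all((a, b) in leq or (b, a) in leq for a in X for b in X)


def agents(eta):
    return set().union(*eta.values())


def satisfies_condition_4(E, leq, eta):
    """For every e and every agent j, the events of j in the past of e form a chain."""
    return all(
        totally_ordered({x for x in down(e, leq) if j in eta[x]}, leq)
        for e in E for j in agents(eta)
    )


def concurrent(e1, e2, leq, eta):
    """Unordered events whose combined past is a chain for every agent."""
    if (e1, e2) in leq or (e2, e1) in leq:
        return False
    past = down(e1, leq) | down(e2, leq)
    return all(totally_ordered({x for x in past if j in eta[x]}, leq)
               for j in agents(eta))


# Hypothetical asynchronous CSA with two agents.
E = {"p1", "p2", "q1", "c1"}
ETA = {"p1": {1}, "p2": {1}, "q1": {1}, "c1": {2}}
LEQ = causal_closure(E, {("p1", "p2"), ("p1", "q1"), ("p1", "c1")})

# Adding an event x of agent 2 that depends on both p2 and q1 violates condition (4).
BAD_E = E | {"x"}
BAD_ETA = dict(ETA, x={2})
BAD_LEQ = causal_closure(BAD_E, {("p1", "p2"), ("p1", "q1"), ("p1", "c1"),
                                 ("p2", "x"), ("q1", "x")})
```

Here `p2` and `c1` are concurrent, `p2` and `q1` are not (they are incomparable events of the same agent, hence in conflict), and the extended structure is rejected because `x` would depend on two conflicting events of agent 1.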

If $e \in E_j$, the local state $\downarrow e$ includes the local history of agent $j$ as well as the "latest"
local histories of all other agents with which $j$ has communicated up to this state. Let
$LC_{CSA} = \{\downarrow e \mid e \in E\}$ be the set of local states of CSA.

By suitably restricting the naming function $\eta$, we can capture interesting subclasses
of communicating sequential agents.

The first restriction is on the number of agents. In a general CSA, we may have an
unbounded number of agents in the system. By restricting the range of $\eta$ to a finite
subset $\{1, 2, \ldots, n\}$ of $N$, we obtain CSA which may have up to $n$ agents, which we call
$n$-CSA.

As we had mentioned earlier, if $\eta(e)$ is not a singleton, the interpretation is that
the event $e$ is performed jointly by the agents mentioned by $\eta(e)$. This intuitively
corresponds to "handshaking" or synchronous communication between agents. By
restricting $\eta$ so that $|\eta(e)| = 1$ for every $e$ in $E$, we effectively rule out this type of
synchronous communication. Instead, in such an asynchronous CSA, the agents
communicate by sending messages to each other. The sending and receiving of a
message are regarded as two distinct actions, each involving only one agent at a time.

Figure 8. An asynchronous CSA.

Finally, we say that a CSA is finitary in case $\downarrow e$ is a finite set for every $e$ in $E$. The
motivation for defining finitary CSA is the same as the motivation for defining finitary
event structures - any computation of a real system can be traced back to some
starting point, so the past of any event occurring during the computation must be finite.

Figure  8  is  an  example  of  an  asynchronous  CSA  consisting  of  two  agents,  a  producer 
and  a  consumer,  communicating  via  an  unbounded  buffer.  The  producer  can  produce 
zero  or  more  items  and  then  quit.  The  consumer  can  consume  items  produced  by 
the  producer  as  long  as  the  items  are  available  in  the  buffer.  The  events  in  the  CSA 
are  labelled  p,  q  and  c  to  denote  these  three  types  of  actions. 


3.  Logics  for  concurrency 

We  now  turn  our  attention  to  the  problem  of  reasoning  about  the  behaviour  of 
distributed  systems. 

A  specification  language  is  simply  a  formalism  in  which  one  specifies  behaviours 
of  systems  under  study.  Thus,  a  specification  language  for  distributed  systems  is  one 
in  which  we  can  describe  behavioural  properties  of  distributed  systems. 

The  specification  language  should  permit  us  to  combine  simple  specifications 
together  to  construct  more  complex  specifications,  reflecting  the  intuition  that  large 
systems  can  be  broken  down  into  more  manageable  subsystems.  This  calls  for 
disjunctive  and  conjunctive  abilities  in  the  language. 

In  addition,  since  we  are  dealing  with  distributed  systems  we  expect  to  describe 
properties  like  causality,  choice  and  concurrency.  For  this,  we  will  need  to  be  able 
to  specify  the  relationships  that  hold  between  system  states  as  the  computation 
proceeds. 

Our  requirements  suggest  the  use  of  a  formal  logic  with  boolean  connectives  and 
temporal  modalities  as  our  specification  language.  Temporal  logic  is  a  branch  of 
modal  logic  which  is  used  to  study  structures  of  states  varying  with  time.  We  will 
design  a  variety  of  modal  logics  which  are  extensions  of  temporal  logic  to  deal  with 
the  models  of  distributed  systems  developed  in  §  2. 

We begin with a quick sketch of classical propositional modal logic. We assume
the existence of a countable set of atomic propositions $\mathcal{P} = \{p_0, p_1, \ldots\}$. The well-formed
formulas of our logic $\mathcal{L}_0$ are defined inductively:

•  every $p \in \mathcal{P}$ is a formula of $\mathcal{L}_0$;

•  if $\alpha$ and $\beta$ are formulas of $\mathcal{L}_0$, then so are $\neg\alpha$, $\alpha \lor \beta$ and $\Diamond\alpha$.

($\neg\alpha$ is to be read as “not $\alpha$”, $\alpha \lor \beta$ is to be read as “$\alpha$ or $\beta$”, and $\Diamond\alpha$ is to be read
as “diamond $\alpha$”). The intended meaning of $\Diamond\alpha$ is “$\alpha$ becomes true eventually”.

Formulas are to be interpreted over frames. In our set-up, a frame is a transition
system $TS = (S, \Sigma, \to)$. A model $M$ is a frame with a valuation function; i.e. $M = (TS, V)$,
where $TS = (S, \Sigma, \to)$ is a transition system and $V: S \to \wp(\mathcal{P})$. For example, if
$V(s) = \{p_1, p_3\}$, we interpret this to mean that propositions $p_1$ and $p_3$ are true at state
$s$ and, further, that no other proposition is true at $s$.

The notion of a formula $\alpha$ being true at a state $s$ in a model $M = (TS, V)$ where
$TS = (S, \Sigma, \to)$, denoted as $M, s \models \alpha$, is defined inductively as follows:

(i)  $M, s \models p$ iff $p \in V(s)$, for $p \in \mathcal{P}$;

(ii)  $M, s \models \neg\alpha$ iff $M, s \not\models \alpha$;

(the notation $M, s \not\models \alpha$ stands for “it is not the case that $M, s \models \alpha$”);

(iii)  $M, s \models \alpha \lor \beta$ iff $M, s \models \alpha$ or $M, s \models \beta$;

(iv)  $M, s \models \Diamond\alpha$ iff $\exists s' \in \mathcal{R}(s)$: $M, s' \models \alpha$;

(recall that $\mathcal{R}(s)$ is the set of states reachable from $s$ via $\to$).

$M, s \models \alpha$ can be interpreted as the assertion that the model $M$ at state $s$ is an
implementation of the specification $\alpha$. We say $\alpha$ is satisfiable if there exists a model
$M = (TS, V)$, where $TS = (S, \Sigma, \to)$, and there exists a state $s \in S$ such that $M, s \models \alpha$. We
say that $\alpha$ is M-valid if $M, s \models \alpha$ for every $s \in S$. We say that $\alpha$ is valid - and denote
this by $\models \alpha$ - if $\alpha$ is M-valid for every model $M$. It is easy to see that $\alpha$ is valid iff $\neg\alpha$
is not satisfiable.

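The following derived formulas are useful.

$\alpha \land \beta \stackrel{\text{def}}{=} \neg(\neg\alpha \lor \neg\beta)$, the conjunction of $\alpha$ and $\beta$,

$\alpha \supset \beta \stackrel{\text{def}}{=} \neg\alpha \lor \beta$, $\alpha$ implies $\beta$,

$\alpha \equiv \beta \stackrel{\text{def}}{=} (\alpha \supset \beta) \land (\beta \supset \alpha)$, logical equivalence of $\alpha$ and $\beta$,

$\Box\alpha \stackrel{\text{def}}{=} \neg(\Diamond\neg\alpha)$, “Henceforth” $\alpha$,

$\mathit{True} \stackrel{\text{def}}{=} p_0 \lor \neg p_0$,

$\mathit{False} \stackrel{\text{def}}{=} \neg\mathit{True}$.

It can easily be verified that for any model $M = ((S, \Sigma, \to), V)$ and $s \in S$,

$M, s \models \Box\alpha$ iff $\forall s' \in \mathcal{R}(s)$: $M, s' \models \alpha$.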
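The inductive truth definition above can be turned almost line by line into a small evaluator for finite models. The following is only an illustrative sketch: the tuple encoding of formulas and the names `reachable`, `holds` and `box` are choices made for this example, and reachability is assumed here to be reflexive and transitive (zero or more transitions).

```python
# Formulas are nested tuples (an encoding chosen for this sketch):
#   ("p", name), ("not", f), ("or", f, g), ("dia", f)

def reachable(trans, s):
    """States reachable from s by zero or more transitions (assumed reading of R(s))."""
    seen, stack = {s}, [s]
    while stack:
        cur = stack.pop()
        for (src, _label, dst) in trans:
            if src == cur and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def holds(trans, val, s, f):
    """Decide M, s |= f for the finite model given by trans (frame) and val (valuation)."""
    kind = f[0]
    if kind == "p":
        return f[1] in val[s]
    if kind == "not":
        return not holds(trans, val, s, f[1])
    if kind == "or":
        return holds(trans, val, s, f[1]) or holds(trans, val, s, f[2])
    if kind == "dia":
        return any(holds(trans, val, t, f[1]) for t in reachable(trans, s))
    raise ValueError(f"unknown connective: {kind}")

def box(f):
    """Derived modality: box f is defined as not dia not f."""
    return ("not", ("dia", ("not", f)))

# A three-state chain s0 -a-> s1 -b-> s2 in which p holds only at s2.
trans = {("s0", "a", "s1"), ("s1", "b", "s2")}
val = {"s0": set(), "s1": set(), "s2": {"p"}}
p = ("p", "p")
```

In this model `holds(trans, val, "s0", ("dia", p))` is true, since s2 is reachable from s0, whereas `box(p)` holds at s2 but not at s0.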

A  number  of  interesting  properties  of  transition  systems  can  be  expressed  using  this 
logic.  Suppose  that  we  are  using  transition  systems  to  model  a  distributed  system 
consisting  of  n  processes  which  can  compete  for  a  shared  resource  r.  Let  the  atomic 
proposition $c_i$ stand for “Process $i$ has access to the resource $r$”. Then,

$$\Box \bigwedge_{i \in \{1,2,\ldots,n\}} \Bigl( c_i \supset \bigwedge_{j \neq i} \neg c_j \Bigr)$$

expresses  a  so-called  safety  property.  It  says  that  at  any  system  state,  at  most  one 
process  has  control  of  the  shared  resource  r.  This  will  ensure,  for  instance,  that  in 
case  r  is  a  shared  piece  of  data  then  the  sequence  of  values  assumed  by  r  during  the 
history  of  the  system  will  be  well-defined.  Broadly  speaking,  safety  properties  assert 
that  “bad”  situations  never  arise  in  the  system. 

Similarly, if we let the proposition $rq_i$ stand for “Process $i$ requires access to resource
$r$”, the formula,

$$\Box \bigwedge_{i \in \{1,2,\ldots,n\}} ( rq_i \supset \Diamond c_i )$$

expresses  a  liveness  property.  It  says  that  any  request  made  by  a  process  for  the  shared 
resource  is  eventually  granted  by  the  system.  In  general,  liveness  properties  specify 
that  something  “good”  occurs  eventually. 
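To make the two properties concrete, the sketch below checks them on a toy model of processes competing for one resource. The model itself (local states "idle", "req", "crit", and the rule that a request is granted only when no other process holds the resource) is an assumption made for this illustration, not a system taken from the text. Note that the diamond of this logic only asks for some reachable state, so the liveness check is a possibility check over reachable states.

```python
def successors(state):
    """One-step successors: exactly one process moves at a time."""
    for i, cur in enumerate(state):
        others_crit = any(x == "crit" for j, x in enumerate(state) if j != i)
        if cur == "idle":
            nxt = "req"
        elif cur == "req":
            if others_crit:
                continue  # the request must wait until the resource is free
            nxt = "crit"
        else:
            nxt = "idle"
        yield state[:i] + (nxt,) + state[i + 1:]

def reachable(state):
    """All states reachable from `state` by zero or more transitions."""
    seen, stack = {state}, [state]
    while stack:
        for t in successors(stack.pop()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def safety(init):
    """Box of: at most one process has access (c_i implies not c_j for j != i)."""
    return all(sum(x == "crit" for x in s) <= 1 for s in reachable(init))

def liveness(init):
    """Box of: rq_i implies diamond c_i, for every process i."""
    return all(
        any(t[i] == "crit" for t in reachable(s))
        for s in reachable(init)
        for i in range(len(s))
        if s[i] == "req"
    )
```

Starting from `("idle", "idle")` both checks succeed; starting from the deliberately bad state `("crit", "crit")` the safety check fails immediately.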

This  logical  framework  is  very  simple,  but  for  that  reason  is  also  not  as  expressive 
as  we  would  wish.  In  particular,  we  would  like  to  devise  logics  to  reason  about  models 
with  true  concurrency.  In  the  rest  of  this  section,  we  shall  show  how  such  logics  can 
be defined for the formal models presented in §2.

3.1  Logic for distributed transition systems

Recall that in a DTS, a concurrent step consists of a transition labelled by a finite set
of actions. This leads us to augment the simple modal logic considered earlier with
one additional modality, $\langle u \rangle$, where $u$ is a finite subset of $\Sigma$, the set of actions.

Let $\mathcal{L}_{DTS}$ be the language whose well-formed formulas are given by:

•  every $p \in \mathcal{P}$ is a formula of $\mathcal{L}_{DTS}$;

•  if $\alpha$ and $\beta$ are formulas of $\mathcal{L}_{DTS}$ then so are $\neg\alpha$, $\alpha \lor \beta$, $\Diamond\alpha$ and $\langle u \rangle\alpha$, where $u$ is
a finite subset of $\Sigma$.

Thus, the logic $\mathcal{L}_{DTS}$ is parametrized by $\Sigma$. To emphasize this, we will write $\mathcal{L}^{\Sigma}_{DTS}$
instead of $\mathcal{L}_{DTS}$.

As one may expect, the frames for our logic are distributed transition systems over
$\Sigma$. A model is a pair $M = (DTS, V)$, where $DTS = (S, \Sigma, \to)$ is a DTS over $\Sigma$ and
$V: S \to \wp(\mathcal{P})$ is the valuation function. Given $s \in S$, the notion $M, s \models \alpha$ is defined as
before for the atomic propositions and for the connectives $\neg$ and $\lor$ and the modality
$\Diamond$. For the new modality we define:

$M, s \models \langle u \rangle\alpha$ iff $\exists s' \in S$: $s \xrightarrow{u} s'$ and $M, s' \models \alpha$.

Relative to the new notion of models, satisfiability and validity are defined as before.
We will write $\models^{\Sigma}_{DTS} \alpha$ to denote that $\alpha$ is a valid formula in this logic. Let $SAT^{\Sigma}_{DTS}$
denote the set of all satisfiable formulas from $\mathcal{L}^{\Sigma}_{DTS}$.

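Before considering an example, we introduce some notational conventions. The
derived modality $[u]$ is defined as:

$[u]\alpha \stackrel{\text{def}}{=} \neg\langle u \rangle\neg\alpha$.

When $u$ is a singleton $\{a\}$, we will write $\langle a \rangle\alpha$ instead of $\langle \{a\} \rangle\alpha$. For the empty step,
we write $\langle \emptyset \rangle\alpha$.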

Now that the modalities are indexed by steps, we can clearly identify the branching
points in a transition system. For example, consider the transition systems shown in
figure 9. In the first system, starting at $s_0$ we can perform $a$ and then choose between
$b$ and $c$ whereas in the second system, at $s'_0$ we have to decide right away whether
we are going to execute $a$ followed by $b$ or $a$ followed by $c$. The first situation is
captured by the formula $\langle a \rangle(\langle b \rangle\mathit{True} \land \langle c \rangle\mathit{True})$ while the second can be expressed
as $\langle a \rangle(\langle b \rangle\mathit{True} \land [c]\mathit{False}) \land \langle a \rangle(\langle c \rangle\mathit{True} \land [b]\mathit{False})$.
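The two branching patterns can be told apart mechanically. The sketch below encodes two small systems in the spirit of the description above (the state names and the dictionary encoding are hypothetical, since the figure itself is not reproduced here) and evaluates the two distinguishing formulas on them.

```python
def succ(trans, s, a):
    """Successors of state s under the single action a."""
    return trans.get((s, a), set())

def can(trans, s, a):
    """<a>True holds at s: some a-transition is enabled."""
    return bool(succ(trans, s, a))

def late_choice(trans, s):
    """<a>(<b>True and <c>True): after a, both b and c are still possible."""
    return any(can(trans, t, "b") and can(trans, t, "c") for t in succ(trans, s, "a"))

def early_choice(trans, s):
    """<a>(<b>True and [c]False) and <a>(<c>True and [b]False)."""
    after_a = succ(trans, s, "a")
    only_b = any(can(trans, t, "b") and not can(trans, t, "c") for t in after_a)
    only_c = any(can(trans, t, "c") and not can(trans, t, "b") for t in after_a)
    return only_b and only_c

# First system: one a-transition, then a choice between b and c.
first = {("s0", "a"): {"s1"}, ("s1", "b"): {"s2"}, ("s1", "c"): {"s3"}}
# Second system: the choice is made when a is performed.
second = {("t0", "a"): {"t1", "t2"}, ("t1", "b"): {"t3"}, ("t2", "c"): {"t4"}}
```

Here `[c]False` at a state is read as "no c-transition is enabled", which is what `not can(...)` computes. The first formula holds only at `s0` in `first`, the second only at `t0` in `second`.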

Figure 9. Varieties of branching in transition systems.

In this logic, we can distinguish between interleavings and true concurrency. For
instance, the formula $\langle a \rangle\langle b \rangle\mathit{True} \land \langle b \rangle\langle a \rangle\mathit{True} \land [\{a, b\}]\mathit{False}$ is satisfiable. At
the state where this formula is true, both the interleavings $ab$ and $ba$ can occur, but
the corresponding concurrent step $\{a, b\}$ is not enabled. On the other hand, it is easy
to see that the formula $\langle \{a, b\} \rangle\alpha \supset \langle a \rangle\langle b \rangle\alpha$ is a valid formula, because the definition
of a DTS guarantees the existence of a function $f$ associated with each step, breaking
it up into substeps.
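The satisfiability claim can be checked on a four-state example. The sketch below is a simplified illustration: it uses a plain step-labelled transition relation and does not enforce the full DTS conditions from §2 (in particular, the substep function is not modelled; the second system simply happens to contain the substeps). The helper names are invented for this example.

```python
def step_succ(trans, s, u):
    """States s' with s --u--> s'; the empty step is treated as a self-loop."""
    u = frozenset(u)
    if not u:
        return {s}
    return {t for (src, lab, t) in trans if src == s and lab == u}

def dia(trans, s, u, pred):
    """<u>pred holds at s."""
    return any(pred(t) for t in step_succ(trans, s, u))

def box(trans, s, u, pred):
    """[u]pred holds at s."""
    return all(pred(t) for t in step_succ(trans, s, u))

def edge(src, actions, dst):
    return (src, frozenset(actions), dst)

def always_true(_state):
    return True

def always_false(_state):
    return False

def only_interleavings(trans, s):
    """<a><b>True and <b><a>True and [{a,b}]False."""
    ab = dia(trans, s, {"a"}, lambda t: dia(trans, t, {"b"}, always_true))
    ba = dia(trans, s, {"b"}, lambda t: dia(trans, t, {"a"}, always_true))
    no_joint_step = box(trans, s, {"a", "b"}, always_false)
    return ab and ba and no_joint_step

# A diamond in which a and b can be interleaved but never performed as one step.
interleaved = {
    edge("s0", {"a"}, "s1"), edge("s1", {"b"}, "s3"),
    edge("s0", {"b"}, "s2"), edge("s2", {"a"}, "s3"),
}
# The same diamond with the concurrent step {a, b} added.
concurrent = interleaved | {edge("s0", {"a", "b"}, "s3")}
```

The formula holds at `s0` in `interleaved` and fails at `s0` in `concurrent`, where the joint step is enabled and, as the valid formula above requires, the sequence a then b is possible as well.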

Returning briefly to the system of $n$ processes considered earlier, assume that the
shared resource $r$ represents a data item in a shared block of memory. Let $ud_i$ denote
the act of process $i$ updating the value of $r$. Then, the specification

$$\Box \bigwedge_{i \neq j} [\{ud_i, ud_j\}]\,\mathit{False},$$

requires  that  the  memory  manager  never  permit  two  distinct  processes  to  concurrently 
update  r. 

Let  us  consider  another  example.  The  writing  of  a  paper  can  be  seen  as  a  sequential 
activity:  work  out  what  you  want  to  say,  write  it  out,  get  it  typed.  In  the  case  of  a 
joint  paper,  the  work  may  be  divided  up  in  terms  of  sections.  One  policy  the  authors 
may  follow  is  to  work  out  all  the  sections  before  preparing  a  typescript,  with  meetings 
for discussion and correction in between. That is, the authors satisfy

$\langle WK \rangle(\mathit{worked} \land \langle WR \rangle(\mathit{written} \land \langle TY \rangle\mathit{typed}))$,

where

WK = {work out §1, work out §2, work out §3},

WR = {write §1, write §2, write §3},

TY = {type §1, type §2, type §3},

and worked, written and typed are atomic propositions indicating the end of the
working out, writing and typing steps respectively. Here we have assumed that there
are three authors each of whom is responsible for one section.

The  concurrent  steps  are  necessary,  since  they  express  the  fact  that  this  is  a  joint 
paper;  if  the  interleaving  of  the  actions  required  for  the  three  sections  were  present, 
we  could  not  rule  out  the  possibility  that  the  three  authors  were  separately  writing 
three  (single-section)  papers. 

The  states  we  are  using  are  global  states.  The  person  working  out  §2  may  refer  to 
a  lemma  in  §1;  the  person  doing  the  word  processing  for  §1  may  use  the  macros 
defined  in  §3. 

It  becomes  necessary  to  use  sequentializations  when  a  complete  record  of  the  writing 
of  the  paper  is  required.  For  example,  a  mistake  pointed  out  by  the  referee  in  §2  may 
be  traced  to  the  lemma  in  §1,  which  may  be  just  a  case  of  wrong  typing  thanks  to  a 
misapplication  of  the  macro  from  §3. 

This  sort  of  mixture  of  independent  actions  and  synchronization  is  well  described 
in  a  DTS  framework. 

We now turn to the formal theory of the language $\mathcal{L}^{\Sigma}_{DTS}$. Typical questions one
asks of such a logic include:

•  Is the set of valid formulas axiomatizable?

•  Is the satisfiability problem decidable?

The answers to these questions provide a good deal of insight into the strengths and
weaknesses of the logic and, most importantly, into the expressive power of the logic.

It turns out that both these questions have positive answers for $\mathcal{L}^{\Sigma}_{DTS}$. Consider
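the following logical system ND.

The system ND
AXIOM SCHEMES

(A0)  All the substitutional instances of the tautologies of Propositional Calculus.

(A1)  (a)  $\Box(\alpha \supset \beta) \supset (\Box\alpha \supset \Box\beta)$  (Deductive Closure)

      (b)  $[u](\alpha \supset \beta) \supset ([u]\alpha \supset [u]\beta)$

(A2)  $\Box\alpha \supset [u]\alpha \land \Box\Box\alpha$  (Reachability)

(A3)  $\alpha \equiv \langle \emptyset \rangle\alpha$  (Empty Step)

(A4,k)  (for $k \ge 1$)  (Step Axiom)

$$\langle u \rangle\alpha \land \bigwedge_{v \subseteq u} [v] \bigvee_{i=1}^{k} \beta_{v,i} \;\supset\; \bigvee_{f \in F(u,k)} \; \bigwedge_{v_1 \subseteq v_2 \subseteq u} \langle v_1 \rangle \bigl( \gamma_{v_1,f} \land \langle v_2 - v_1 \rangle \gamma_{v_2,f} \bigr)$$

where $F(u,k)$ is the set of all functions $\{f \mid f: \wp(u) \to \{1, 2, \ldots, k\}\}$ and

$$\gamma_{v,f} = \begin{cases} \beta_{v,f(v)} \land \alpha, & \text{if } v = u, \\ \beta_{v,f(v)}, & \text{if } v \subset u. \end{cases}$$

INFERENCE RULES

(MP)  $\dfrac{\alpha, \quad \alpha \supset \beta}{\beta}$

(TG)  $\dfrac{\alpha}{\Box\alpha}$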

Axioms A0 to A2 and the rules MP and TG are standard. The characteristic axioms
of distributed transition systems are A3 and A4,k. A3 captures the fact that the empty
step cannot change the state of the system. A4,k is actually an infinite set of axioms,
finitely presented. The complicated formulation of A4,k is necessary to describe the
fact that each concurrent step $u$ in a DTS can be broken up into concurrent substeps
which are specified by the associated function $f: \wp(u) \to S$.

A formula $\alpha$ is called a thesis of the system ND - denoted $\vdash_{ND} \alpha$ - iff $\alpha$ can be
derived in a finite number of steps using the axioms and inference rules of ND.


Theorem 1.1. (1) ND is a sound and complete axiomatization of the valid formulas in
$\mathcal{L}^{\Sigma}_{DTS}$. In other words, $\vdash_{ND} \alpha$ iff $\models^{\Sigma}_{DTS} \alpha$ for every $\alpha \in \mathcal{L}^{\Sigma}_{DTS}$.

(2) The satisfiability problem for this logic (i.e. the membership problem for $SAT^{\Sigma}_{DTS}$)
is decidable in nondeterministic exponential time.

It turns out that combining concurrency, captured by the step notion, with
determinacy leads to a very expressive class of models. The frame $TS = (S, \Sigma, \to)$ is
said to be deterministic if for every $s \in S$ and every $u \in \wp_{fin}(\Sigma)$ there exists at most one
$s' \in S$ such that $s \xrightarrow{u} s'$. A model is deterministic if its underlying frame is.

The formula $\alpha$ is said to be deterministically satisfiable if there exists a deterministic
model for $\alpha$. Similarly, $\alpha$ is said to be deterministically valid if $\alpha$ is valid over the class
of deterministic models. Let $\models^{\Sigma}_{Det} \alpha$ denote that $\alpha$ is deterministically valid and let
$DSAT^{\Sigma}_{DTS}$ denote the set of deterministically satisfiable formulas in $\mathcal{L}^{\Sigma}_{DTS}$.
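For a finite frame, the determinacy condition is easy to test directly. The following sketch assumes a frame given as a collection of (source, step, target) triples, an encoding chosen only for this illustration.

```python
from collections import defaultdict

def is_deterministic(trans):
    """True iff every state has at most one successor under each step."""
    targets = defaultdict(set)
    for src, step, dst in trans:
        targets[(src, frozenset(step))].add(dst)
    return all(len(dsts) <= 1 for dsts in targets.values())

# Two a-steps from t0 lead to different states, so this frame is not deterministic.
branching = [("t0", {"a"}, "t1"), ("t0", {"a"}, "t2")]
# Distinct steps from the same state do not violate determinacy.
diamond = [("s0", {"a"}, "s1"), ("s0", {"b"}, "s2"), ("s0", {"a", "b"}, "s3")]
```

Determinacy constrains each step label separately: `diamond` is deterministic even though s0 has three successors, because each is reached by a different step.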

It turns out that the deterministically valid formulas in $\mathcal{L}^{\Sigma}_{DTS}$ are axiomatizable.
Thanks to determinacy, one obtains a much simpler axiomatization than for the
general case. Let D denote the logical system obtained from ND by dropping the
infinitary set of axioms A4,k ($k \ge 1$) and adding two new axioms:

(A5)  $\langle u \rangle\alpha \supset \langle v \rangle\langle u - v \rangle\alpha$, ($v \subseteq u$),  (Weak Step Axiom)

(A6)  $\langle u \rangle\alpha \supset [u]\alpha$.  (Determinacy)

Let $\vdash_{D} \alpha$ denote that $\alpha$ is derivable in D.

Theorem 1.2. (1) D is a sound and complete axiomatization of the deterministically
valid formulas in $\mathcal{L}^{\Sigma}_{DTS}$. In other words, $\vdash_{D} \alpha$ iff $\models^{\Sigma}_{Det} \alpha$ for every $\alpha \in \mathcal{L}^{\Sigma}_{DTS}$.
(2) The membership problem for $DSAT^{\Sigma}_{DTS}$ is undecidable.

The  surprise  here  is  that  determinacy  adds  a  sufficient  amount  of  expressive  power 
to  make  the  satisfiability  problem  undecidable.  By  combining  concurrent  steps  in  a 
deterministic  fashion,  it  turns  out  that  we  can  encode  the  two-dimensional  grid  of 
natural  numbers  N  x  N.  We  can  then  use  this  encoding  to  reduce  some  undecidable 
tiling  problems  described  by  Wang  (1961)  and  Harel  (1985)  to  the  problem  of 
deterministic satisfiability in our logic. This negative result was shown by Parikh
(1989,  pp.  199-209). 

A  variety  of  positive  and  negative  results  can  be  obtained  in  this  logical  framework 
by  studying  the  effect  of  placing  suitable  restrictions  on  the  DTS.  For  instance,  we 
can restrict the set of actions $\Sigma$ to be finite. Alternatively, we can demand that the
DTS  as  a  whole  be  finite  -  that  is,  the  set  of  states  and  the  set  of  transitions  both  be 
finite.  We  can  also  incorporate  ideas  from  trace  theory,  arising  out  of  the  work  of 
Mazurkiewicz  (1989,  pp.  285-363),  and  define  trace  transition  systems,  which  permit 
both  local  and  global  specifications  of  concurrency.  Finally,  we  can  also  study  a 
smooth  generalization  of  Propositional  Dynamic  Logic  (Harel  1984,  pp.  497-604) 
obtained  by  extending  the  notion  of  a  regular  program  to  permit  concurrent  steps 
as atomic actions. The details can be found in a forthcoming paper (Lodaya et al 1991).

The logical language $\mathcal{L}^{\Sigma}_{DTS}$ can also be interpreted over $\Sigma$-labelled elementary net
systems and $\Sigma$-labelled event structures, where the labelling function is co-injective.
The frames that we use are the corresponding DTS, as defined in §2. Thus, a $\Sigma$-labelled
elementary net system $N_{\Sigma} = (N, \phi)$, where $N = (B, E, F, c_{in})$, gives rise to a model
$(DTS_{N_{\Sigma}}, V)$, where $V$ is a valuation on the states of $DTS_{N_{\Sigma}}$. Similarly, a $\Sigma$-labelled
event structure $ES_{\Sigma} = (ES, \phi)$, where $ES = (E, \le, \#)$, defines a model $(DTS_{ES_{\Sigma}}, V)$,
where $V$ is a valuation on the states of $DTS_{ES_{\Sigma}}$.

Let $SAT^{\Sigma}_{N}$ and $SAT^{\Sigma}_{ES}$ denote the set of formulas from $\mathcal{L}^{\Sigma}_{DTS}$ satisfiable in models
generated by $\Sigma$-labelled elementary net systems and $\Sigma$-labelled event structures
respectively.

Theorem 1.3. $SAT^{\Sigma}_{DTS} = SAT^{\Sigma}_{N} = SAT^{\Sigma}_{ES}$.

In  other  words,  this  logic  cannot  discriminate  between  these  classes  of  models. 

3.2  Logic  for  event  structures 

We now turn from distributed transition systems to event structures as frames for
our  logic.  In  the  logic  for  DTS,  we  used  the  global  state  approach  to  reasoning  about 
the  behaviour  of  the  system.  In  this  approach,  assertions  are  made  by  a  “global” 
observer  of  the  system  who  can  “see”  the  distributed  system  in  its  entirety  in  any 
given  state.  This  is  appropriate  for  DTS  since  the  states  of  a  DTS  do  in  fact  correspond 
to  the  global  states  of  the  system  being  modelled. 

Alternatively,  we  can  reason  about  the  system  from  the  point  of  view  of  the  local 
states  of  the  system.  Here,  assertions  are  made  by  individual  agents  in  the  system 
and  hence  the  nature  of  the  assertion  is  determined  by  the  “visibility”  of  the  system 
state  from  that  agent’s  point  of  view.  This  approach  is  more  suitable  for  reasoning 
based on event structures, where we can use a local configuration $\downarrow e$ to represent the
local  state  of  the  system  at  the  point  where  the  event  e  has  just  occurred. 

Another feature of the DTS logic is that concurrency is described by explicitly
specifying the actions which are to be performed concurrently and describing the effect
of  such  actions.  This  approach  is  natural  for  the  DTS  because  the  models  themselves 
are  action-based.  On  the  other  hand,  in  an  event  structure  it  is  more  convenient  to 
specify  concurrency  in  an  abstract  manner  by  simply  asserting  facts  about  concurrent 
events  without  specifying  which  actions  are  to  be  performed  concurrently. 

The  key  notions  in  the  theory  of  event  structures  are  those  of  causality,  conflict 
and  concurrency.  This  leads  us  to  extend  the  language  by  adding  modalities  to 
capture  these  notions.  It  turns  out  to  be  fruitful  to  split  up  causality  into  two  parts, 
allowing  us  to  specify  both  “past”  and  “future”  behaviour. 

The logic $\mathcal{L}_{ES}$ is built up as follows: again fix $\mathcal{P} = \{p_0, p_1, \ldots\}$, a countable set of
atomic propositions. Then the well-formed formulas of $\mathcal{L}_{ES}$ are given by:

•  every $p \in \mathcal{P}$ is a formula of $\mathcal{L}_{ES}$;

•  if $\alpha$ and $\beta$ are formulas of $\mathcal{L}_{ES}$, then so are $\neg\alpha$, $\alpha \lor \beta$, $\Diamond\alpha$, $\blacklozenge\alpha$, $\Delta\alpha$ and $\nabla\alpha$.

Here, the modalities $\Diamond$ and $\blacklozenge$ denote the future and past respectively. $\Delta$ will
be used to describe concurrency and $\nabla$ will be used to capture conflict.

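Frames for this logic are event structures, or rather the local configurations of event
structures. More precisely, a frame is a pair $(ES, LC_{ES})$, where $ES = (E, \le, \#)$ is an
event structure and $LC_{ES}$ is the set of local configurations of ES.

A model is a pair $M = ((ES, LC_{ES}), V)$ where $(ES, LC_{ES})$ is a frame and $V: LC_{ES} \to \wp(\mathcal{P})$ is a
valuation function. If $p \in V(\downarrow e)$ then this is taken to mean that $p$ is true at the local
state $\downarrow e$ in the model $M$.

The notion of a formula $\alpha$ being true at a local state $\downarrow e$ in the model $M =
((ES, LC_{ES}), V)$ is denoted as $M, \downarrow e \models \alpha$ and is defined inductively as follows.

(i)  $M, \downarrow e \models p$, iff $p \in V(\downarrow e)$, for $p \in \mathcal{P}$.

(ii)  $M, \downarrow e \models \neg\alpha$, iff $M, \downarrow e \not\models \alpha$.

(iii)  $M, \downarrow e \models \alpha \lor \beta$, iff $M, \downarrow e \models \alpha$ or $M, \downarrow e \models \beta$.

(iv)  $M, \downarrow e \models \Diamond\alpha$, iff $\exists e'$: $e < e'$ and $M, \downarrow e' \models \alpha$.

(v)  $M, \downarrow e \models \blacklozenge\alpha$, iff $\exists e'$: $e' < e$ and $M, \downarrow e' \models \alpha$.

(vi)  $M, \downarrow e \models \nabla\alpha$, iff $\exists e'$: $e \,\#\, e'$ and $M, \downarrow e' \models \alpha$.

(vii)  $M, \downarrow e \models \Delta\alpha$, iff $\exists e'$: $e \;co\; e'$ and $M, \downarrow e' \models \alpha$.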

Notice that we have defined the modalities $\Diamond$ and $\blacklozenge$ in an irreflexive manner.
This is necessary for the axiomatization which follows.
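The four modalities simply quantify over the events related to the current one by strict future, strict past, conflict and concurrency. The sketch below illustrates this on a small finite event structure; the class, its method names and the example events are hypothetical, and the concurrency relation is computed here as "distinct, causally unordered and not in conflict".

```python
from itertools import product

class EventStructure:
    def __init__(self, events, causes, conflicts):
        self.events = set(events)
        # Reflexive-transitive closure of the given causality pairs.
        le = {(e, e) for e in self.events} | set(causes)
        changed = True
        while changed:
            changed = False
            for (a, b), (c, d) in product(list(le), repeat=2):
                if b == c and (a, d) not in le:
                    le.add((a, d))
                    changed = True
        self.le = le
        # Conflict is symmetric and inherited along the causal order.
        base = set(conflicts) | {(b, a) for (a, b) in conflicts}
        self.conflict = {
            (x, y)
            for (a, b) in base
            for x in self.events if (a, x) in le
            for y in self.events if (b, y) in le
        }

    def lt(self, a, b):
        return a != b and (a, b) in self.le

    def co(self, a, b):
        return (a != b and (a, b) not in self.le and (b, a) not in self.le
                and (a, b) not in self.conflict)

    def related(self, e, kind):
        tests = {
            "future": lambda x: self.lt(e, x),
            "past": lambda x: self.lt(x, e),
            "conflict": lambda x: (e, x) in self.conflict,
            "co": lambda x: self.co(e, x),
        }
        return {x for x in self.events if tests[kind](x)}

def some(es, val, e, kind, prop):
    """Truth at e of the modality for `kind` applied to the atomic proposition prop."""
    return any(prop in val[x] for x in es.related(e, kind))

# e1 causes e2 and e3, which are in conflict; e4 is independent of everything.
es = EventStructure({"e1", "e2", "e3", "e4"}, {("e1", "e2"), ("e1", "e3")}, {("e2", "e3")})
val = {"e1": set(), "e2": {"p"}, "e3": {"q"}, "e4": {"r"}}
```

For instance, `some(es, val, "e1", "future", "p")` is true because e2 lies strictly above e1, while `some(es, val, "e2", "future", "p")` is false: the modalities are irreflexive, so p holding at e2 itself does not count.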

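The notions of satisfiability and validity are defined as usual. $\models_{ES} \alpha$ will denote that
$\alpha$ is a valid formula in $\mathcal{L}_{ES}$.

The derived connectives $\land$, $\supset$, $\equiv$, $\Box$ are defined as before. In addition, we set
$\blacksquare\alpha \stackrel{\text{def}}{=} \neg\blacklozenge\neg\alpha$, $\overline{\nabla}\alpha \stackrel{\text{def}}{=} \neg\nabla\neg\alpha$, $\overline{\Delta}\alpha \stackrel{\text{def}}{=} \neg\Delta\neg\alpha$.

We can also define a useful derived modality as follows:

$\langle\ast\rangle\alpha \stackrel{\text{def}}{=} \alpha \lor \Diamond\alpha \lor \blacklozenge\alpha \lor \nabla\alpha \lor \Delta\alpha$.

$\langle\ast\rangle\alpha$ is to be read as “Somewhere $\alpha$”. Its dual $[\ast]\alpha \stackrel{\text{def}}{=} \neg\langle\ast\rangle\neg\alpha$, read as “Everywhere $\alpha$”,
expands as follows:

$[\ast]\alpha = \alpha \land \Box\alpha \land \blacksquare\alpha \land \overline{\nabla}\alpha \land \overline{\Delta}\alpha$.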

Thus the “Everywhere” formula $[\ast]\alpha$ describes a property invariant over the entire model.

Many interesting features of event structures can be expressed in this logic. Recall
that the maximal computations of event structures are termed runs. We can use an
atomic proposition $p$ to mark out a run with the formula $p \equiv \overline{\nabla}\neg p$. For any
model $M = ((ES, LC_{ES}), V)$, if the formula $p \equiv \overline{\nabla}\neg p$ is M-valid, then $\{e \mid M, \downarrow e \models p\}$
constitutes a run of ES. Using this method of marking out runs, we can express
liveness and safety properties in event structures. Let $\alpha$ represent a liveness property.
Then the “Somewhere” formula $\langle\ast\rangle(p \land \alpha)$ is M-valid for a model $M$ just in case every computation of the
underlying event structure contains a local state where $\alpha$ is true. Similarly, if $\beta$
represents an undesirable situation, the formula $[\ast](p \supset \neg\beta)$ expresses the safety
property that $\beta$ does not occur at any state of the run marked by $p$.
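The run-marking condition says that an event is marked exactly when no event in conflict with it is marked. On a finite event structure this can be checked directly, as in the sketch below. The function name and the example are hypothetical, and the conflict relation passed in is assumed to be already closed under inheritance.

```python
def marks_run(events, conflict, marked):
    """True iff an event is in `marked` exactly when nothing in conflict with it is."""
    def in_conflict(a, b):
        return (a, b) in conflict or (b, a) in conflict

    return all(
        (e in marked) == all(x not in marked for x in events if in_conflict(e, x))
        for e in events
    )

# e2 and e3 are in conflict; e1 and e4 conflict with nothing.
events = {"e1", "e2", "e3", "e4"}
conflict = {("e2", "e3")}
```

With this data, `{"e1", "e2", "e4"}` and `{"e1", "e3", "e4"}` satisfy the condition. The set `{"e1", "e4"}` does not, because e2 is unmarked although nothing in conflict with it is marked, and the full set does not, because it contains two conflicting events.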

In a similar spirit the formula $\gamma \equiv \Box\neg\gamma \land \blacksquare\neg\gamma$ can be used to capture the
notion of a cut - a maximal set of pair-wise incomparable events. Within a
computation, a cut corresponds to a global state. Thus we can use the notion of a
cut in conjunction with that of a run to look “sideways” from a local state and make
assertions about the current global state.

The formula $\nabla\alpha \supset \Box\nabla\alpha$ describes the fact that conflict is inherited in a prime
event structure. The formula $\Delta\alpha \supset \blacksquare(\Delta\alpha \lor \Diamond\alpha)$ expresses the fact that the
configurations of an event structure are “consistent” by asserting that the unified past
of any pair of events in co is conflict-free.

Due  to  lack  of  space,  we  will  not  provide  a  separate  detailed  example  for  this  logic. 
The logic presented in the next section, called $\mathcal{L}_{CSA}$, is also based on event structures.
We  shall  provide  a  detailed  example  for  that  logic.  It  will  not  be  difficult  to  see  how 
that  example  can  be  translated  into  the  present  framework. 

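Consider the logical system E.

The system E
AXIOM SCHEMES

(A0)  All the substitutional instances of the tautologies of Propositional Calculus.

(A1)  (i)  $\Box(\alpha \supset \beta) \supset (\Box\alpha \supset \Box\beta)$  (Deductive closure)

      (ii)  $\blacksquare(\alpha \supset \beta) \supset (\blacksquare\alpha \supset \blacksquare\beta)$

      (iii)  $\overline{\nabla}(\alpha \supset \beta) \supset (\overline{\nabla}\alpha \supset \overline{\nabla}\beta)$

      (iv)  $\overline{\Delta}(\alpha \supset \beta) \supset (\overline{\Delta}\alpha \supset \overline{\Delta}\beta)$

(A2)  (i)  $\Box\alpha \supset \Box\Box\alpha$  (Transitivity of $<$)

      (ii)  $\blacksquare\alpha \supset \blacksquare\blacksquare\alpha$

(A3)  (i)  $\alpha \supset \overline{\nabla}\nabla\alpha$  (Symmetry of $\#$ and co)

      (ii)  $\alpha \supset \overline{\Delta}\Delta\alpha$

(A4)  (i)  $\alpha \supset \Box\blacklozenge\alpha$  (Relating past and future)

      (ii)  $\alpha \supset \blacksquare\Diamond\alpha$

(A5)  $\nabla\alpha \supset \Box\nabla\alpha$  (Conflict inheritance)

(A6)  $\Delta\alpha \supset \blacksquare(\Diamond\alpha \lor \Delta\alpha)$  (Conflict-free past)

(A7)  (i)  $\Diamond\alpha \supset \Box(\alpha \lor \Diamond\alpha \lor \blacklozenge\alpha \lor \nabla\alpha \lor \Delta\alpha)$  (Relating $<$, $\#$ and co)

      (ii)  $\nabla\alpha \supset \overline{\nabla}(\alpha \lor \Diamond\alpha \lor \blacklozenge\alpha \lor \nabla\alpha \lor \Delta\alpha)$

      (iii)  $\Delta\alpha \supset \overline{\Delta}(\alpha \lor \Diamond\alpha \lor \blacklozenge\alpha \lor \nabla\alpha \lor \Delta\alpha)$

      (iv)  $\blacklozenge\alpha \supset \blacksquare(\alpha \lor \Diamond\alpha \lor \blacklozenge\alpha \lor \Delta\alpha)$

      (v)  $\nabla\alpha \supset \overline{\Delta}(\Diamond\alpha \lor \nabla\alpha \lor \Delta\alpha)$

      (vi)  $\Delta\alpha \supset \Box(\blacklozenge\alpha \lor \nabla\alpha \lor \Delta\alpha)$

INFERENCE RULES

(MP)  $\dfrac{\alpha, \quad \alpha \supset \beta}{\beta}$

(TG)  (i)  $\dfrac{\alpha}{\Box\alpha}$   (ii)  $\dfrac{\alpha}{\blacksquare\alpha}$   (iii)  $\dfrac{\alpha}{\overline{\Delta}\alpha}$   (iv)  $\dfrac{\alpha}{\overline{\nabla}\alpha}$

(UNIQ)  $\dfrac{\hat{p} \supset \alpha}{\alpha}$

where $p$ is an atomic proposition not appearing in $\alpha$ and

$\hat{p} = p \land \Box\neg p \land \blacksquare\neg p \land \overline{\Delta}\neg p \land \overline{\nabla}\neg p$.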


Axioms A0 to A4 and inference rules MP and TG are standard. A5 expresses the
fact that conflict is inherited via $\le$. A6 ensures that any two events related by co
have consistent (i.e. conflict-free) pasts. The remaining axioms are necessary to capture
the fact that the relations $<$, $\#$ and co “cover” the event structure - i.e. any two
distinct events are related by one of these relations.

The rule UNIQ is adapted from Burgess (1980). Given a proposition $p$, the definition
of $\hat{p}$ ensures that it can be true in at most one local configuration. Hence, we can
label each local configuration $\downarrow e$ by a distinct formula $\hat{p}_e$. The rule UNIQ allows us
to construct this labelling, which is crucial in demonstrating the completeness of the
axiomatization.

Let $\vdash_{E} \alpha$ denote that $\alpha$ is a thesis of the system E.

Theorem 2.1. E is a sound and complete axiomatization of the valid formulas in $\mathcal{L}_{ES}$.
In other words, $\vdash_{E} \alpha$ iff $\models_{ES} \alpha$.

Recall that we had defined an auxiliary relation in an event structure, called the
minimal conflict relation. We can define a modality $\nabla_{\mu}$ to capture the relation $\#_{\mu}$.

It is possible to strengthen $\mathcal{L}_{ES}$ by replacing the modality $\nabla$ by the modality $\nabla_{\mu}$.
Let us call this new language $\mathcal{L}^{\mu}_{ES}$. To obtain a useful comparison with $\mathcal{L}_{ES}$, and also
to obtain an axiomatization, we must change the notion of a frame. For this language,
we define a frame to be a pair $(ES, LC_{ES})$ where ES is a well-branching event structure.
Recall that a well-branching event structure is one in which the $\#$ relation can be
completely specified using the relations $\le$ and $\#_{\mu}$. As usual, a model is a frame
together with a valuation function. Models based on well-branching frames are called
well-branching models.

The semantics of $\mathcal{L}^{\mu}_{ES}$ is the same as that of $\mathcal{L}_{ES}$ except that the clause for $\nabla$ is
replaced by:

$M, \downarrow e \models \nabla_{\mu}\alpha$ iff $\exists e'$: $e \,\#_{\mu}\, e'$ and $M, \downarrow e' \models \alpha$.

In $\mathcal{L}^{\mu}_{ES}$, we can obtain $\nabla$ as a derived modality:

$\nabla\alpha \stackrel{\text{def}}{=} \nabla_{\mu}\alpha \lor \nabla_{\mu}\Diamond\alpha \lor \blacklozenge\nabla_{\mu}\alpha \lor \blacklozenge\nabla_{\mu}\Diamond\alpha$.

As before, $\overline{\nabla}_{\mu}\alpha$ denotes the formula $\neg\nabla_{\mu}\neg\alpha$. It is easy to verify that $\overline{\nabla}\alpha$ can
be expressed as follows:

$\overline{\nabla}\alpha = \overline{\nabla}_{\mu}\alpha \land \overline{\nabla}_{\mu}\Box\alpha \land \blacksquare\overline{\nabla}_{\mu}\alpha \land \blacksquare\overline{\nabla}_{\mu}\Box\alpha$.

In a well-branching model, the derived modalities $\nabla$ and $\overline{\nabla}$ have precisely the
same interpretation as the corresponding modalities of $\mathcal{L}_{ES}$. On the other hand, there
is no obvious way to characterize the minimal conflict relation using the modality
$\nabla$. In this connection, we can establish the following result.

Theorem 2.2. For well-branching models, the language $\mathcal{L}^{\mu}_{ES}$ is strictly more expressive
than $\mathcal{L}_{ES}$.

Informally, this result says that we can use formulas from $\mathcal{L}^{\mu}_{ES}$ to differentiate
models which are indistinguishable using the language $\mathcal{L}_{ES}$.

An example of the use of $\nabla_{\mu}$ is in systems where agents have names, like
communicating sequential agents. For each event $e$ that process $i$ participates in, we
can assign an atomic proposition $\tau_i$ to the local configuration $\downarrow e$. Suppose that there
are $n$ agents in the system, with “names” $\{1, 2, \ldots, n\}$. Then the formula
$\bigwedge_{i \in \{1,2,\ldots,n\}} (\tau_i \supset \overline{\nabla}_{\mu}\tau_i)$ expresses the fact that all choices in behaviour are made locally by
individual agents.

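The axiom system $E_{\mu}$ is obtained by adding the following axiom schemes to the
system E.

(A1)  (v)  $\overline{\nabla}_{\mu}(\alpha \supset \beta) \supset (\overline{\nabla}_{\mu}\alpha \supset \overline{\nabla}_{\mu}\beta)$,  (Deductive Closure)

(A3)  (iii)  $\alpha \supset \overline{\nabla}_{\mu}\nabla_{\mu}\alpha$,  (Symmetry of $\#_{\mu}$)

(A6)  (ii)  $\nabla_{\mu}\alpha \supset \blacksquare(\Diamond\alpha \lor \Delta\alpha)$,  (Minimal Conflict)

A1(v) and A3(iii) are standard. A6(ii) is the characteristic axiom describing the
relation $\#_{\mu}$ as the minimal conflict relation.

Let $\vdash_{E_{\mu}} \alpha$ denote that $\alpha$ is a thesis of the system $E_{\mu}$ and let $\models^{\mu}_{ES} \alpha$ denote that $\alpha$ is
valid over the class of well-branching models. Then we get:

Theorem 2.3. $E_{\mu}$ is a sound and complete axiomatization of the valid formulas in $\mathcal{L}^{\mu}_{ES}$.
In other words, $\vdash_{E_{\mu}} \alpha$ iff $\models^{\mu}_{ES} \alpha$.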

3.3  Logic  for  communicating  sequential  agents 


We  now  wish  to  study  a  means  of  talking  about  a  central  feature  of  many  distributed 
systems  -  the  communication  pattern  between  the  components  of  the  system  that 
ensure  coordination.  For  this,  we  shall  define  a  logic  that  is  to  be  interpreted  over 
communicating  sequential  agents. 

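Let $\mathcal{P} = \{p_0, p_1, \ldots\}$ be a countable set of atomic propositions, and $\mathcal{T} = \{\tau_0, \tau_1, \ldots\}$,
a countable set of type propositions disjoint from $\mathcal{P}$. The formulas of $\mathcal{L}_{CSA}$ are built
up as follows:

•  every member of $\mathcal{P} \cup \mathcal{T}$ is a formula of $\mathcal{L}_{CSA}$;

•  if $\alpha$ and $\beta$ are formulas of $\mathcal{L}_{CSA}$, then so are $\neg\alpha$, $\alpha \lor \beta$, $\Diamond_i\alpha$ and $\blacklozenge_i\alpha$.

The formula $\tau_i$ asserts that the observer is located in agent $i$. $\Diamond_i$ and $\blacklozenge_i$ capture
the “visible” future and past of agent $i$. This will become clearer when we define the
formal semantics of these modalities.

A frame for $\mathcal{L}_{CSA}$ is a pair $(CSA, LC_{CSA})$, where $CSA = (E, \le, \eta)$ is a system of
communicating sequential agents and $LC_{CSA}$ is the set of local states of CSA. A model
is a pair $M = ((CSA, LC_{CSA}), V)$ where $(CSA, LC_{CSA})$ is a frame and $V: LC_{CSA} \to \wp(\mathcal{P} \cup \mathcal{T})$
is a valuation function such that

$\tau_i \in V(\downarrow e)$ iff $i \in \eta(e)$.

The notion $M, \downarrow e \models \alpha$ can be defined inductively as follows.

(i)  $M, \downarrow e \models a$, iff $a \in V(\downarrow e)$, for $a \in \mathcal{P} \cup \mathcal{T}$.

(ii)  $M, \downarrow e \models \neg\alpha$, iff $M, \downarrow e \not\models \alpha$.

(iii)  $M, \downarrow e \models \alpha \lor \beta$, iff $M, \downarrow e \models \alpha$ or $M, \downarrow e \models \beta$.

(iv)  $M, \downarrow e \models \blacklozenge_i\alpha$, iff $\exists e' \in E_i$: $e' \le e$ and $M, \downarrow e' \models \alpha$.

(v)  $M, \downarrow e \models \Diamond_i\alpha$ iff

$$\begin{cases} e \in E_i: & \exists e' \in E_i: e \le e' \text{ and } M, \downarrow e' \models \alpha, \\ e \notin E_i: & \forall e' \in E_i: \text{if } e' \le e \text{ then } M, \downarrow e' \models \Diamond_i\alpha. \end{cases}$$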


Note that $\diamondminus_i\alpha$ behaves like a normal past modality - it covers all
events that lie in the $i$-past of $e$. However $\Diamond_i\alpha$ is different: in agent
$j$, $j \ne i$, it asserts that up to the last communication from $i$, there is a future
for agent $i$ satisfying $\alpha$. In case there is no communication from agent $i$ at
all, agent $j$ can assert $\Diamond_i\alpha$ for any formula $\alpha$.
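
The semantic clauses above can be exercised on a small finite example. The following Python sketch is an illustration only, under stated assumptions: the CSA is finite, each local state $\downarrow e$ is identified with its event $e$, and the names `CSA`, `holds`, `implies`, `box` and the tuple encoding of formulas are hypothetical, not taken from the text.

```python
class CSA:
    """A finite system of communicating sequential agents (E, <=, eta)."""

    def __init__(self, events, pairs, eta):
        self.events = list(events)
        self.eta = {e: frozenset(eta[e]) for e in self.events}
        # <= is the reflexive-transitive closure of the given pairs (Warshall).
        leq = {(e, e) for e in self.events} | set(pairs)
        for k in self.events:
            for a in self.events:
                for b in self.events:
                    if (a, k) in leq and (k, b) in leq:
                        leq.add((a, b))
        self.leq = leq

    def events_of(self, i):
        """E_i: the events in which agent i participates."""
        return [e for e in self.events if i in self.eta[e]]


def holds(csa, val, e, f):
    """Decide M, down(e) |= f for a formula f encoded as a nested tuple."""
    op = f[0]
    if op == "prop":                       # atomic proposition
        return f[1] in val.get(e, set())
    if op == "type":                       # tau_i holds iff i is in eta(e)
        return f[1] in csa.eta[e]
    if op == "not":
        return not holds(csa, val, e, f[1])
    if op == "or":
        return holds(csa, val, e, f[1]) or holds(csa, val, e, f[2])
    i, g = f[1], f[2]
    e_i = csa.events_of(i)
    if op == "past":                       # clause (iv)
        return any((e2, e) in csa.leq and holds(csa, val, e2, g) for e2 in e_i)
    if op == "future":                     # clause (v), two cases
        if e in e_i:
            return any((e, e2) in csa.leq and holds(csa, val, e2, g) for e2 in e_i)
        return all(holds(csa, val, e2, f) for e2 in e_i if (e2, e) in csa.leq)
    raise ValueError(f"unknown operator {op!r}")


def implies(f, g):
    return ("or", ("not", f), g)


def box(i, f):
    """The dual of the future modality: not future_i not f."""
    return ("not", ("future", i, ("not", f)))


# Agent 1 performs a1 then a2; agent 2 performs b1 then b2.
# The pair (a1, b2) models a message from agent 1 that reaches agent 2 at b2.
example = CSA(
    events=["a1", "a2", "b1", "b2"],
    pairs={("a1", "a2"), ("b1", "b2"), ("a1", "b2")},
    eta={"a1": {1}, "a2": {1}, "b1": {2}, "b2": {2}},
)
valuation = {"a2": {"p"}}  # p holds only at the local state of a2
```

At `b1`, where agent 2 has heard nothing from agent 1, `("future", 1, ...)` holds vacuously for any argument; at `b2` it holds only for formulas that some future of `a1` actually satisfies.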

Define $\Box_i\alpha \stackrel{\mathrm{def}}{=} \neg\Diamond_i\neg\alpha$ and
$\boxminus_i\alpha \stackrel{\mathrm{def}}{=} \neg\diamondminus_i\neg\alpha$. It can be
verified that $\Box_i\alpha \supset \diamondminus_i\Box_i\alpha$ is a valid formula over
CSA. It asserts that an invariant formula about an agent must be supported by a
communication from that agent. Thus $\Box_i$ is a “strong” modality whereas $\Diamond_i$
is “weak”, unlike in standard modal logic. This asymmetry arises from the fact that in
distributed systems, the past of other agents can be completely obtained by messages,
while the possibilities for the future are only locally known.
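
To see why an invariant about agent $i$ needs a supporting communication, it may help to unfold the dual of clause (v) in its two cases. The following is a sketch derived from the semantics above, not a derivation given in the source; $\Box_i$ is read as $\neg\Diamond_i\neg$.

```latex
\begin{align*}
e \in E_i:\quad
  & M, \downarrow e \models \Box_i\alpha
    \iff \forall e' \in E_i:\ e \le e' \Rightarrow M, \downarrow e' \models \alpha, \\
e \notin E_i:\quad
  & M, \downarrow e \models \Box_i\alpha
    \iff \exists e' \in E_i:\ e' \le e \ \text{and}\ M, \downarrow e' \models \Box_i\alpha .
\end{align*}
```

In the first case $e$ itself is an $i$-event in its own past, since $\le$ is reflexive. In the second case the witness $e'$ is an $i$-event in the past of $e$ at which the invariant already holds. Either way some $i$-event below $e$ carries the invariant.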

Notice that the formula $\tau_i \wedge \tau_j$ is satisfied at a local state
$\downarrow e$ only if $\{i, j\} \subseteq \eta(e)$ and thus specifies a synchronization
between agents $i$ and $j$. The infinite set of formulas
$\{\tau_i \supset \neg\tau_j \mid i \ne j\}$ together specify that each event is in at
most one agent and hence can specify asynchronous CSA.

Consider the formula
$\diamondminus_i\alpha \wedge \diamondminus_i\beta \supset \diamondminus_i(\alpha \wedge \diamondminus_i\beta) \vee \diamondminus_i(\beta \wedge \diamondminus_i\alpha)$.
This specifies that agent $i$ is backwards linear - during a computation if we look back
at any two events involving agent $i$, then they must be ordered. This captures the fact
that agents in a CSA are sequential.

Similarly, the formula
$\diamondminus_i\alpha \supset \diamondminus_i(\alpha \wedge \boxminus_i(\neg\alpha \supset \boxminus_i\neg\alpha))$
can be used to specify finitary CSA, i.e. those where each event has a finite past. This
formula asserts that if $\alpha$ is true somewhere in the past, then we can find an
“earliest” point where $\alpha$ is true.

The principal advantage of this logic is that communication between agents in a
distributed system can be easily expressed:
$\neg\tau_i \wedge \diamondminus_i\alpha \wedge \tau_j$ can be used to specify that $i$
has communicated the truth of $\alpha$ to $j$ sometime in the past.

We  shall  present  a  detailed  example  of  reasoning  with  this  logic  at  the  end  of  this 
section.  First,  we  present  our  main  technical  results  for  this  logic. 

We  begin  with  logical  system  C  defined  below. 


The  system  C 


AXIOM  SCHEMES 


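(A0) All the substitutional instances of the tautologies of Propositional Calculus.

(A1) (a) $\boxminus_i(\alpha \supset \beta) \supset (\boxminus_i\alpha \supset \boxminus_i\beta)$   (Deductive closure)
     (b) $\Box_i(\alpha \supset \beta) \supset (\Box_i\alpha \supset \Box_i\beta)$

(A2) (a) $\tau_i \supset (\boxminus_i\alpha \supset \alpha)$   (Local reflexivity)
     (b) $\tau_i \supset (\Box_i\alpha \supset \alpha)$

(A3) $\diamondminus_i\diamondminus_j\alpha \supset \diamondminus_j\alpha$   (Transitivity)

(A4) (a) $\diamondminus_i\alpha \supset \Box_i\diamondminus_i\alpha$   (Relating past and future)
     (b) $\Diamond_i\alpha \supset \boxminus_i\Diamond_i\alpha$

(A5) $\diamondminus_i\alpha \wedge \diamondminus_i\beta \supset \diamondminus_i(\alpha \wedge \diamondminus_i\beta) \vee \diamondminus_i(\beta \wedge \diamondminus_i\alpha)$   (Backward linearity)

(A6) $\Box_i\alpha \supset \diamondminus_i\Box_i\alpha$   (Communication)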

(A7)  (a)  StT, 

(Type  axioms) 

(b)  Tj  ZD 

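INFERENCE RULES

(MP)   $\dfrac{\alpha, \quad \alpha \supset \beta}{\beta}$

(TG$\boxminus_i$)   $\dfrac{\alpha}{\boxminus_i\alpha}$

(TG$\Box_i$)   $\dfrac{\alpha}{\tau_i \supset \Box_i\alpha}$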


Axioms A0 to A4 are standard axioms suitably modified to reflect the special
interpretation of $\Diamond_i$. A5 asserts that individual agents are sequential. A6
captures the fact that knowledge about another agent’s future can only be obtained via
communication. A7 ensures that the type propositions from $\mathcal{T}$ are assigned
consistently. The rules MP and TG$\boxminus_i$ are standard. The standard form of the rule
TG$\Box_i$ will not preserve validity because of the communication requirement imposed
by the semantics of $\Box_i$.

Let $\vdash_C \alpha$ denote that $\alpha$ is a thesis of the system C. Let
$\models_{CSA} \alpha$ denote that $\alpha$ is valid over the class of models based on
CSA. We then have the following result.


Theorem 3.1. C is a sound and complete axiomatization of the valid formulas of
$\mathcal{L}_{CSA}$. In other words, $\vdash_C \alpha$ iff $\models_{CSA} \alpha$ for
every $\alpha \in \mathcal{L}_{CSA}$.

When we introduced CSA in §2, we had defined their various subclasses. Let
$CSA = (E, \le, \eta)$ be a CSA. Recall that CSA is an $n$-CSA if
$\eta(E) \subseteq \{1, 2, \ldots, n\}$ - that is, there are at most $n$ agents in the
system. CSA is an asynchronous-CSA (ACSA) if $\forall e \in E : |\eta(e)| = 1$. CSA is
finitary if $\forall e \in E : \downarrow e$ is a finite set. We can combine these
notions; for example, an $n$-ACSA is an ACSA with a bounded number of agents. Similarly,
we can have finitary $n$-CSA, finitary ACSA and, finally, finitary $n$-ACSA. Figure 10
pictorially represents the relationships between these various classes. The arrows in the
figure indicate inclusion.
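
The subclass conditions can be checked mechanically on a finite example. The sketch below is illustrative only: it represents $\eta$ as a Python dict from event names to sets of agent numbers, and the function names are hypothetical. Every finite CSA is trivially finitary, so that condition is not tested here.

```python
def all_agents(eta):
    """Union of eta(e) over all events e."""
    return set().union(*eta.values())


def is_n_csa(eta, n):
    """n-CSA: every agent that occurs lies in {1, ..., n}."""
    return all_agents(eta) <= set(range(1, n + 1))


def is_asynchronous(eta):
    """ACSA: every event belongs to exactly one agent."""
    return all(len(agents) == 1 for agents in eta.values())


# "s" is a synchronization between agents 1 and 2, so eta_sync is not an ACSA.
eta_sync = {"a": {1}, "b": {2}, "s": {1, 2}}
eta_async = {"a": {1}, "b": {2}, "c": {2}}
```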

Let $\mathcal{C}$ denote one of the subclasses of CSA mentioned above. Then we can define
the notions of satisfiability and validity relative to $\mathcal{C}$. Thus, a formula
$\alpha$ is $\mathcal{C}$-satisfiable if we can find a model $M = ((CSA, LC_{CSA}), V)$
for $\alpha$ such that $CSA \in \mathcal{C}$. We let $SAT_{\mathcal{C}}$ denote the set
of $\mathcal{C}$-satisfiable formulas in $\mathcal{L}_{CSA}$. $\alpha$ is
$\mathcal{C}$-valid if it is valid over the class of models based on frames in
$\mathcal{C}$.

We can axiomatize the $\mathcal{C}$-valid formulas for all these subclasses. The required
axiomatizations are obtained by suitably combining the system C with the following
axiom schemes.


Figure 10. Subclasses of communicating sequential agents.

AUXILIARY AXIOM SCHEMES AND INFERENCE RULES


(A8) $\tau_1 \vee \tau_2 \vee \ldots \vee \tau_n$   (n agents)

(A9) $\tau_i \supset \neg\tau_j$, for $i \ne j$   (disjoint agents)

(A10) (a) $\diamondminus_i\alpha \supset \diamondminus_i(\alpha \wedge \boxminus_i(\neg\alpha \supset \boxminus_i\neg\alpha))$   (well-founded agents and communications)

(b)  <8>,-a  3  <$>,-(a  A  B^  B;~i  a),  for  i  ^  j. 

Theorem 3.2. (1) The logical system $C_A \stackrel{\mathrm{def}}{=} C + (A9)$ is sound and
complete for the class of models based on ACSA.

(2) The logical system $C_F \stackrel{\mathrm{def}}{=} C + (A10)$ is sound and complete
for the class of models based on finitary CSA.

(3) The logical system $C_{FA} \stackrel{\mathrm{def}}{=} C_A + (A10)$ is sound and
complete for the class of models based on finitary ACSA.

(4) The logical system $C_n \stackrel{\mathrm{def}}{=} C + (A8)$, $n \in N$, is sound and
complete for the class of models based on $n$-CSA.

(5) The logical system $C_{nA} \stackrel{\mathrm{def}}{=} C_A + (A8)$, $n \in N$, is sound
and complete for the class of models based on $n$-ACSA.

(6) The logical system $C_{nF} \stackrel{\mathrm{def}}{=} C_F + (A8)$, $n \in N$, is sound
and complete for the class of models based on finitary $n$-CSA.

(7) The logical system $C_{nFA} \stackrel{\mathrm{def}}{=} C_{FA} + (A8)$, $n \in N$, is
sound and complete for the class of models based on finitary $n$-ACSA.

We also have the following relationship between satisfiability in subclasses with an
unbounded number of agents and the corresponding subclasses with only a bounded
number of agents.


Theorem 3.3. Let $\mathcal{C}$ range over CSA, ACSA, finitary CSA and finitary ACSA. Let
$n$-$\mathcal{C}$, $n \in N$, denote the corresponding class with a bounded number of
agents $n$. Then $SAT_{\mathcal{C}} = \bigcup_n SAT_{n\text{-}\mathcal{C}}$.

We  now  give  a  detailed  example  of  how  communication  between  agents  can  be 
specified  in  CSA.  Consider  a  distributed  database  accessed  by  n  processes  which 
communicate  with  each  other  by  exchanging  messages.  A  protocol  is  needed  whereby 
the  processes  can  commit  to  a  distributed  transaction.  When  each  committed  process 
knows  that  all  the  others  have  also  committed  it  can  go  ahead  and  perform  its  local 
share  of  the  distributed  transaction.  For  this,  the  following  requirement  must  be  met. 

If  any  process  commits  to  the  transaction  then  it  eventually  knows  that  all  processes 
in  the  system  have  also  committed. 

Such  distributed  transaction  commit  protocols  commonly  arise  in  the  design  of 
distributed  systems  (Pinter  &  Wolper  1984,  pp.  28-37). 

We now specify the protocol requirement in our logical language. Let
$\{c_1, \ldots, c_n\}$ be a set of atomic propositions, where $c_j$ is read to mean
“process $j$ has committed to the transaction”. The formula


A  A  ct 


(1) 


expresses the requirement above.

A two-stage implementation of this protocol may use two local boolean variables
in each process $P_i$:

• a variable $l_i$ in which process $P_i$ records whether it can participate in the
transaction or not, and

• a variable, which we also call $c_i$, to record the commitment of the process to the
transaction.

The implementation can perhaps run as follows:

Process $P_i$:

(1) as soon as a local decision $l_i$ is made, broadcast $l_i$ to all other processes;

(2) when $l_j$ is heard from all $j$, set $c_i$ to True;

(3) as soon as $c_i$ is set, broadcast it to all other processes;

(4) when $c_j$ is heard from all $j$, perform transaction;

(5) acknowledge all incoming messages.
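
Steps (1) to (4) can be simulated in a few lines. The Python sketch below is a simplification under stated assumptions rather than the authors' implementation: one FIFO queue stands in for the network, delivery is reliable (so the acknowledgements of step (5) are omitted), a process commits only if its own local decision is positive, and the name `run_commit` is hypothetical.

```python
from collections import deque


def run_commit(n, local_decisions):
    """Simulate the two-stage protocol for n processes.

    local_decisions[i] is the value of l_i. Returns (committed, performed),
    where committed[i] is c_i and performed[i] says whether P_i reached step (4).
    """
    others = [set(range(n)) - {i} for i in range(n)]
    heard_l = [set() for _ in range(n)]   # senders whose l_j has arrived at i
    heard_c = [set() for _ in range(n)]   # senders whose c_j has arrived at i
    committed = [False] * n
    performed = [False] * n
    channel = deque()                     # reliable FIFO network (assumption)

    def advance(i):
        # Steps (2) and (3): commit once every l_j is heard, then broadcast c_i.
        if not committed[i] and local_decisions[i] and heard_l[i] == others[i]:
            committed[i] = True
            channel.extend(("c", i, j) for j in sorted(others[i]))
        # Step (4): perform the transaction once every c_j is heard.
        if committed[i] and heard_c[i] == others[i]:
            performed[i] = True

    # Step (1): broadcast every positive local decision.
    for i in range(n):
        if local_decisions[i]:
            channel.extend(("l", i, j) for j in sorted(others[i]))
    for i in range(n):
        advance(i)

    while channel:
        kind, src, dst = channel.popleft()
        (heard_l if kind == "l" else heard_c)[dst].add(src)
        advance(dst)
    return committed, performed
```

In this simulation every process that commits also reaches step (4), which is the informal content of the requirement stated earlier: a committed process eventually learns that all the others have committed.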

All processes follow the same protocol in a symmetric manner. This is, of course, a
naive protocol. However, our aim here is merely to illustrate the use of our logical
language. Let us again, by abuse of notation, use $\{l_1, \ldots, l_n\}$ to denote another
set of atomic propositions. Consider now the following formulas:


(2) 


“$c_i$ is set True only when $l_j$ is heard from all other processes $P_j$”,


(3) 


“if $c_i$ is set, then it will be broadcast and acknowledged”.

Note  that  here  an  agent  has  to  assert  something  about  the  state  of  other  agents 
and  this  can  be  done  only  using  messages  from  them.  The  formula 
says that agent $i$ has received an acknowledgement from agent $j$ of the message $c_i$
sent  from  i  to  j.  This  is  necessary  because  we  assume  that  messages  may  be  lost  in 
this  network. 

It  is  easy  to  verify  that  the  formulas  (2)  and  (3)  together  imply  the  requirement  (1) 
above.  In  fact,  we  can  use  the  axiom  system  C  and  logically  deduce  the  requirement 
from  (2)  and  (3).  This  verifies  that  the  simple  protocol  above  meets  its  specification. 

Note  that  the  protocol  above  works  for  only  one  transaction,  in  the  sense  that  the 
commitment  is  stable;  once  a  process  commits  to  the  transaction,  it  stays  committed. 
When  a  protocol  is  needed  for  several  transactions,  we  can  index  the  transactions  by 
sequence  numbers  and  modify  the  specification  above  appropriately. 

While  the  preceding  example  illustrates  the  specification  of  a  protocol  which 
assumes  complete  connectivity  in  the  network  of  communicating  agents,  we  can  also 
specify  protocols  which  demand  specific  patterns  of  connectivity.  Since  agents  are 
syntactically  mentioned  in  formulas,  this  logic  is  particularly  suited  for  describing 
communications  which  name  specific  agents.  We  illustrate  this  point  with  another 
detailed  example. 

Assume that processes $P_0, P_1, \ldots, P_{n-1}$ are connected in a ring and communicate
with each other only by exchanging messages. A process $P_i$ can communicate only
with its neighbours $P_{i-1}$ and $P_{i+1}$ on the ring. Here and in the sequel, addition
and subtraction are assumed to be modulo $n$.

Assume that each process $P_i$ maintains a variable $x_i$ taking values in $N$ and whose
value initially is $v_i$, for $0 \le i \le n-1$. It is desired to specify a distributed
protocol which computes the greatest common divisor (GCD) of the values
$v_0, \ldots, v_{n-1}$. Let result denote the value of the constant
$\gcd(v_0, v_1, \ldots, v_{n-1})$. When the computation terminates, the variables $x_i$,
$i \in \{0, \ldots, n-1\}$, should satisfy

$x_0 = x_1 = \cdots = x_{n-1} = \mathit{result}$.

Since our logical language is propositional in nature we cannot express values of
variables and hence assume countably many propositions $Z_i^k$, $k \in N$, to denote
“$x_i = k$”. With this understanding we write such propositions as equalities. Similarly
we assume propositions to denote “$k < l$”, “$k = i - l$” etc. The protocol requirement is
then specified by

$\bigwedge_i \bigl(\tau_i \wedge (x_i = v_i) \supset \Diamond_i \bigwedge_k \diamondminus_k (x_k = \mathit{result})\bigr)$.

An algorithm for computing the GCD can be described as follows: process $P_i$, at any
state, compares the current value of $x_i$ with the current values of its neighbours,
$x_{i-1}$ and $x_{i+1}$. In case $x_i$ is smaller, nothing needs to be done; if $x_{i-1}$
is smaller, $x_i$ is updated to be $x_i - x_{i-1}$; similarly, if $x_{i+1}$ is smaller,
$x_i$ is updated to be $x_i - x_{i+1}$. Whenever the value of $x_i$ changes, this is
communicated to the neighbouring processes. Eventually, all values stabilize at the
greatest common divisor.
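
The update rule can be tried out with a short sequential simulation. This Python sketch makes simplifying assumptions that are not in the text: processes are visited in round-robin order, each reads its neighbours' current values directly instead of exchanging messages, and all initial values are strictly positive (a neighbour holding 0 would never cause an update). The function name `ring_gcd` is hypothetical.

```python
def ring_gcd(values):
    """Run the neighbour-subtraction rule on a ring until no value changes."""
    if any(v <= 0 for v in values):
        raise ValueError("this sketch assumes strictly positive initial values")
    x = list(values)
    n = len(x)
    changed = True
    while changed:
        changed = False
        for i in range(n):
            # Indices are taken modulo n, as in the text.
            for j in ((i - 1) % n, (i + 1) % n):
                if x[j] < x[i]:
                    x[i] -= x[j]      # x_i := x_i - x_j keeps gcd(x) unchanged
                    changed = True
    return x
```

Each update replaces a value by a smaller positive one and preserves the GCD of the whole ring, so the loop terminates, and it can stop only when every process agrees with both neighbours, that is, when all values are equal to the GCD.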

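As before, we assume that messages may fail and hence received messages are
always acknowledged. Let $\Diamond_{i \to j}\alpha$ abbreviate the formula
$\tau_i \wedge \Diamond_i\diamondminus_j\alpha$. (In some sense, this stands for “$i$
sends the message $\alpha$ to $j$ and receives an acknowledgement”.)
Our protocol can now be specified as

$\bigwedge_i (\tau_i \supset \Box_i\delta \wedge \boxminus_i\delta)$

where $\delta \stackrel{\mathrm{def}}{=} \delta_0 \wedge \delta_1 \wedge \delta_2 \wedge \delta_3$ is given by:

$\delta_0 : \bigl(x_i = v \supset \bigwedge_{j \in \{i-1, i+1\}} \Diamond_{i \to j}(x_i = v)\bigr)$

“neighbours are always kept informed of current $x_i$ value”

$\delta_1 : \bigl(x_i = v \supset \Box_i(x_i = v' \supset v' \le v)\bigr)$

“values are never increased”

$\delta_2 : \bigl(x_i = v \wedge \diamondminus_{i-1}(x_{i-1} = v') \wedge v' < v \supset \Diamond_i(x_i = v'' \wedge v'' = v - v')\bigr)$

“if $x_{i-1} < x_i$ then $x_i := x_i - x_{i-1}$”

$\delta_3 : \bigl(x_i = v \wedge \diamondminus_{i+1}(x_{i+1} = v') \wedge v' < v \supset \Diamond_i(x_i = v'' \wedge v'' = v - v')\bigr)$

“if $x_{i+1} < x_i$ then $x_i := x_i - x_{i+1}$”.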

It is easy to see that this specifies a distributed implementation of Euclid’s algorithm
for computing the GCD.

4. Discussion

In  this  paper,  we  have  looked  at  models  for  distributed  systems  which  emphasize 
their  nonsequential  behaviour  and  considered  their  logical  characterization  using  an 
assortment  of  modal  logics. 

A fair amount of theory has been developed for the models we have considered.
Our notion of a distributed transition system is only one of several that have been
considered; alternative formulations include those of Degano & Montanari (1987)
and Boudol & Castellani (1988). Stark (1989) has defined a related class of models called
concurrent transition systems. In net theory, more general net systems include Petri
nets, predicate/transition nets and coloured nets (Brauer et al 1987). As far as event
structures are concerned, we have only considered prime event structures in this
paper; other classes of event structures include stable event structures and general
event structures (Winskel 1987, pp. 325-392) as well as flow event structures (Boudol
1990, pp. 62-95). Systems of communicating sequential agents were introduced in
Lodaya et al (1989b), as a generalization of the n-agent event structures described in
Lodaya & Thiagarajan (1987, pp. 290-303).

The  models  that  we  have  dealt  with  in  this  paper  are  closely  related  to  each  other. 
We  have  described  how  labelled  net  systems  and  labelled  event  structures  give  rise 
to  distributed  transition  systems  in  a  natural  way.  A  strong  relationship  also  exists 
between  elementary  net  systems  and  prime  event  structures  (Nielsen  et  al  1980,  1990). 
The  connection  between  CSA  and  event  structures  is  described  in  Lodaya  et  al  (1989b). 
By  establishing  formal  connections  between  models  in  this  manner,  we  can  translate 
results  obtained  using  one  class  of  models  to  other  classes. 

As for the logics that we have described here, the main results that we have are
sound and complete axiomatizations for different classes of models (see Lodaya et al
1987, 1989a, pp. 508-522, 1989b, 1991, Mukund & Thiagarajan 1989, pp. 143-160,
1991, and Mukund 1990). For the logic for distributed transition systems, we also
have various decidability and undecidability results (Lodaya et al 1991). However,
for the logics for event structures and CSA, the decidability question remains open.

Several  attempts  have  been  made  to  use  logics  to  characterize  the  behaviour  of 
distributed  programs.  Temporal  modalities  have  been  traditionally  interpreted  over 
different  types  of  tense  structures  (Burgess  1980,  1984,  pp.  89-133).  Using  the 
interleaving  approach  to  modelling  concurrency,  various  authors  have  used  temporal 
logics  defined  on  sequences  and  trees  to  describe  concurrent  computations  (see  e.g. 
Pnueli  1977,  pp.  46-57,  Gabbay  et  al  1980;  Clarke  et  al  1986).  Pinter  &  Wolper  (1984, 
pp.  28-37)  have  extended  this  work  to  true  concurrency  by  explicitly  using  partial 
orders  to  represent  concurrent  computations.  Katz  &  Peled  (1989,  pp.  489-507)  have 
defined  a  first-order  temporal  logic  over  sets  of  partial  orders. 

However,  the  use  of  classes  of  behavioural  structures  for  distributed  systems  as 
frames  for  logics  seems  to  be  relatively  new.  Penczek  (1988)  has  used  event  structures 
as  frames  and  is  the  first  to  use  an  explicit  modality  to  represent  conflict.  Reisig  (1986, 
pp.  603-627)  is  working  on  logics  which  directly  use  elementary  net  systems  as  frames. 
Christiansen  (1989)  has  worked  with  CSA-like  frames;  he  uses  an  indexed  A  modality 
in  his  logic  to  describe  concurrency  across  agents. 

Trace  theory  is  a  language  theoretic  approach  to  describing  concurrency  which  we 
have  not  considered.  This  formalism  also  gives  rise  to  models  of  distributed  systems 
with  true  concurrency.  Here,  along  with  an  alphabet  of  actions,  one  is  given  an 
independence relation declaring which actions in the system are concurrent. Instead
of viewing a computation as a string of symbols from the alphabet, one now considers
sequences  made  up  of  sets  of  concurrent  actions  (sequences  of  concurrent  steps,  in 
our  framework),  which  are  called  traces.  Like  strings,  traces  form  a  monoid,  called  a 
partially  commutative  monoid,  and  so  one  can  meaningfully  talk  about  trace 
languages.  A  syntactic  Kleene-like  characterization  of  regular  trace  languages  has 
been  given  by  Ochmanski  (1985),  while  a  characterization  in  terms  of  automata  has 
been  obtained  by  Zielonka  (1987).  The  pomsets  of  Gischer  and  Pratt  (Pratt  1986)  are 
similar  to  traces. 

Logics  for  trace  theory  have  not  been  considered  in  the  literature.  We  believe  that 
results  like  the  ones  in  §3.1  can  be  obtained  (Lodaya  et  al  1991). 

Another  widely  prevalent  approach  to  modelling  concurrency  is  algebraic.  One 
way  of  describing  sequential  nondeterministic  programs  is  through  regular  expressions, 
by  interpreting  the  operators  •,  +  and  *  as  sequential  composition,  choice  and 
iteration.  Similarly,  in  the  algebraic  approach  to  concurrency,  one  introduces  an 
operator  to  denote  the  parallel  composition  of  programs.  Program  behaviour  is 
specified  by  modelling  the  language  operators  in  an  appropriate  semantic  domain. 
Popular  languages  for  concurrency  include  CSP  (Hoare  1984),  CCS  (Milner  1989)  and 
ACP  (Bergstra  &  Klop  1984),  and  the  models  most  often  used  are  transition  systems 
(Plotkin  1981)  and  equational  algebras  (Bergstra  &  Klop  1984).  Most  of  this  work 
has  been  based  on  interleaving  models  and  only  recently  have  attempts  been  made 
to  give  a  “truly  concurrent”  semantics  to  these  languages  (Olderog  1987,  pp.  196-223, 
van  Glabbeek  &  Vaandrager  1987,  pp.  224-242,  Degano  et  al  1989,  pp.  438-466). 
An  earlier  denotational  semantics  using  event  structures  as  domains  was  given  in 
Winskel  (1982,  pp.  561-577). 

In  this  framework,  Hennessy  &  Milner  (1985)  have  used  action-indexed  logics  to 
characterize  computations  of  sequential  nondeterministic  systems.  Assuming  an 
interleaving  model  of  concurrency,  this  characterization  extends  to  the  computations 
of  distributed  systems.  This  work  has  been  considerably  extended  by  Stirling  (1987). 
However,  the  emphasis  here  is  on  axiomatizing  program  equivalences  using  equational 
logic.  Our  use  of  action-indexed  logics  for  models  exhibiting  true  concurrency  is 
inspired  by  this  work,  but  we  have  concentrated  on  axiomatizing  the  valid  formulas, 
as  is  traditional  in  logic. 

Logics  in  which  the  modalities  are  indexed  by  programs,  rather  than  just  actions, 
arose  in  the  framework  of  program  verification  (Hoare  1969).  Programs  with  parallel 
composition  operators  have  been  considered  by  several  authors  (e.g.  Apt  et  al  1980, 
Moitra  1983).  Dynamic  logics,  originally  defined  over  sequential  programs  (Harel  1984), 
have  been  extended  with  an  operator  for  intersection  to  model  synchronization  (Peleg 
1987).  However,  a  lot  of  work  remains  to  be  done  on  characterizing  models  for  true 
concurrency using program-indexed logics.

Abstract. We correlate the level of knowledge of certain formulas in a
group  of  individuals  with  certain  regular,  downward  closed,  sets  of  strings. 
We  show  that  in  suitable  circumstances,  all  such  sets  can  occur  as  levels 
of  knowledge  but  that  the  lack  of  synchrony,  or  the  lack  of  asynchrony 
when  there  are  only  two  processors  in  the  group,  can  create  more  or  less 
severe  restrictions. 

Keywords.  Levels  of  knowledge;  distributed  systems;  common  knowledge. 


1.  Introduction 

It  has  been  suggested  recently  that  the  notions  of  knowledge  and  common  knowledge 
may  be  useful  in  analysing  the  behaviour  of  distributed  systems  and  in  designing 
protocols  (Parikh  &  Ramanujam  1985,  pp.  256-268;  Chandy  &  Misra  1986;  Parikh 
1986,  pp.  322-331;  Halpern  &  Zuck  1987,  pp.  269-280;  Moses  &  Tuttle  1988;  Halpern 
&  Moses  1990). 

As specific examples, we cite the paper by Moses & Tuttle (1988) who proved that
certain synchronized action problems require common knowledge, and that there is
always a most efficient solution which is an implementation of a simple knowledge
based algorithm - an algorithm where there are explicit tests of knowledge. This
algorithm is of the form “repeat ... until $C_U A$” for a certain formula $A$.¹

Again, it was shown by Halpern & Zuck (1987, pp. 269-280) that if a sequence of
bits is communicated in an asynchronous system where messages can be delayed or
lost (but if they are received, they are received in order), then to prove correctness
of the protocol it is necessary and sufficient to prove that $K_sK_rK_sK_r$(“value of the
current bit”)² is true whenever the sender sends the next bit. Unlike the case with
Moses & Tuttle (1988), common knowledge is not necessary, nor would it be attainable
if needed.


¹ $C_U A$ means that there is common knowledge among the processes in $U$ that $A$ is
true. The following important property of common knowledge is used in the
synchronization algorithms: common knowledge is always achieved simultaneously by all
the processes involved.

² The sender knows that the receiver knows that the sender knows that the receiver knows
the value of the current bit.

Other places where these notions have been found to be important are the semantics
of  natural  language  (Lewis  1969;  Schiffer  1972)  and  mathematical  economics  (Aumann 
1976;  Parikh  &  Krasucki  1990). 

Since there are useful protocols which can be specified in terms of knowledge
formulae, formulae of the form $K_{i_1}K_{i_2}\ldots K_{i_m}A$, where the $K_{i_j}$ are
knowledge operators, we investigate which sets of such formulae can occur as states (or
levels) of knowledge for some formula $A$.

We define a logic of knowledge by augmenting some base logic $L$ (e.g. propositional
logic) with modal operators $K_i$ for $i \le n$. For every process $i$, $i$ knows $A$ if
and only if according to $i$'s view of the world, $A$ must be true.

We  assume  that  processes’  information  is  always  correct,  although  often  incomplete. 
Processes  may  not  be  able  to  describe  precisely  the  global  state  of  the  system,  some 
states  may  be  indistinguishable  to  them,  but  they  always  admit  the  real  state  of  the 
system  as  one  of  the  possibilities. 

This leads to a Kripke semantics where every process $i$ has some indistinguishability
relation $R_i$ and in the state $s$, process $i$ knows that a formula $A$ is true iff for
all the states $s'$ which look the same to $i$ as $s$, $A$ is true in $s'$. Since we deal
with distributed systems, and we want to carry all the information about the past in our
states, we will call the states histories. A history $H$ is a sequence of snapshots of the
system, i.e. a sequence of $n$-tuples of actions ($n$ is the number of processes), where a
snapshot is taken at every tick of a very fine clock. If a process does nothing, its
action is taken to be the special null action.

A first natural problem is the characterization of the sets of strings of knowledge
operators $x$ such that for some fixed formula $A$ and history $H$, $H \models xA$.

This is a purely logical problem. The results are valid for every model of distributed
systems where the indistinguishability relation $R_i$ for every process $i$ is an
equivalence relation.

These sets of knowledge formulae (formulae of the form $K_iK_j\ldots A$), we will call
levels of knowledge.

The  second  problem  is  the  design  of  a  protocol,  when  possible,  in  which  all  the 
knowledge  formulae  (and  exactly  the  knowledge  formulae)  in  the  set  are  satisfied. 

To  solve  this  problem  we  need  to  design  a  model  of  distributed  system  in  which 
we  can  control  the  acquisition  of  knowledge.  We  need  a  system  in  which  there  is  no 
accidental  knowledge  (or  accidental  synchrony),  a  system  in  which  all  the  knowledge 
of  the  process  about  the  others  must  be  a  result  of  some  communications. 

If processes in a distributed system know each other’s programs, the lack of
knowledge in the system may be due to: non-determinism, inputs, faults,
communications and asynchrony.

In  our  system,  lack  of  knowledge  will  be  the  result  of  three  factors:  initial  private 
inputs  of  processes,  lack  of  synchrony,  possible  delays  in  the  communication  system. 

It  turns  out  that  the  existence  of  a  protocol  depends  on  different  assumed  means 
of  communication  available  in  a  distributed  system. 

We will analyse systems where all communications are asynchronous (there is
uncertainty about the delivery time of a message); systems where all communications
are instantaneous (synchronous systems); and systems where both types of
communications are available.

The  paper  will  consist  of  three  main  parts: 

(1) Logic and levels of knowledge - description of the logic, definition of the level of
knowledge of a formula, characterization of levels of knowledge.

(2) Model of a distributed system - definition of the distributed system in which all
knowledge about other processors is acquired through communication.

(3) Realization of levels of knowledge in distributed systems - given a set which may
be the level of knowledge of some formula we design a protocol which realizes
precisely that level.

Our  work  is  a  continuation  and  extension  of  the  research  initiated  by  Parikh  (1986, 
pp.  322-331)  and  it  includes  the  results  from  Parikh  (1986,  pp.  322-331)  as  well  as 
Parikh  &  Krasucki  (1986). 


2.  Logic  and  levels  of  knowledge 

A protocol is a set of global histories where global histories are finite sequences of
events in the system, which are prefixes of runs of the system. $L_0$ is a language which
describes properties of the global histories in protocol $P$. For every sentence $A$ in
$L_0$, and for every history $H \in P$, $A$ is either true or false in $H$.

We want to make sure that every processor’s private information is expressible in
our language. To accomplish that we assume that we have in our language a countable
set of propositions $Q_{i,j}$, where $Q_{i,j}$ is the proposition that the $j$th input
value of processor $i$ is 1. The $Q_{i,j}$ are independent but only finitely many can be
true. The truth values of the $Q_{i,j}$ are the private facts of individual $i$. When we
do not care about the ‘owners’ of these facts $Q_{i,j}$ then we shall refer to them just
as the ‘private facts’ and denote them as $P_k$.

$L$ is the closure of $L_0$ under truth functional connectives. $L$ can be extended to a
larger language $LK$ which is the closure of $L$ under the knowledge operators $K_i$ (for
$i \in N$) and the usual truth functional connectives. Here $K_i(A)$ means that
processor $i$ knows $A$.

The class of all models we consider is the class of all protocols $P$ as described in
the next section. Let’s fix $P$. Now we define the notion $H \models A$ for $A$ in $LK$
by recursion on the complexity of $A$. We assume that for every process $i$ there is some
equivalence relation $\approx_i$ defined on histories. $H \approx_i H'$ means that
according to $i$’s view $H$ and $H'$ are indistinguishable.

(0) If $A$ is from $L_0$ then the semantics is given.

(1) If $A$ is $Q_{i,j}$ then $A$ is true in $H$ if the $j$th bit of the input of
processor $i$ in $H$ is 1:

$H \models A$ iff $H = (v_1, \ldots, v_n); H'$, $(v_i)_j = 1$.

(2) If $A$ is $\neg A'$ then $H \models A$ iff $H \not\models A'$.

If $A$ is $B \vee C$ then $H \models A$ iff ($H \models B$ or $H \models C$).

(3) $H \models K_i A$ iff $\forall H' \in P$, $H \approx_i H' \to H' \models A$.

We will sometimes also need to refer to common knowledge operators $C_U$ where
$U \subseteq N$. If $A$ is of the form $C_U(B)$, then $A$ is true in $H$ iff $B$ is true in every $H'$ such that
there is a chain of processors $i_1, \dots, i_k$ in $U$ and a chain of histories $H = H_1, \dots, H_k, H_{k+1} = H'$,
such that for every $1 \le j \le k$, $H_j \approx_{i_j} H_{j+1}$.

We also define a new relation $\approx_U$ as follows:

$H \approx_U H'$ iff $\exists i \in U$, $H \approx_i H'$.

Then $\approx_U^*$ will be the reflexive, transitive closure of $\approx_U$, and we have the following
equivalent semantics for the common knowledge operator:

$H \models C_U A$ iff $\forall H' \in P$, $H \approx_U^* H' \rightarrow H' \models A$.

Note that if $U$ is empty, then $H \models C_U A$ iff $H \models A$, since $\approx_U^*$ is then the identity relation.
For a set $X$ of formulas, we will write $H \models X$ to mean that for all $A \in X$, $H \models A$.
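The semantic clauses for $K_i$ and $C_U$ can be tried out on a very small finite model. The following Python sketch is only an illustration: the toy protocol `P` (histories reduced to a pair of input bits), the `view` function and the helper names `K` and `C` are assumptions made here, not part of the formal development.

```python
from itertools import product

# Hypothetical toy protocol: a "history" is just the pair of input bits (v_0, v_1).
P = list(product([0, 1], repeat=2))

def view(i, h):
    """Assumed local view: processor i sees only its own input bit."""
    return h[i]

def indist(i, h, h2):
    """h and h2 are indistinguishable to i iff i's views coincide."""
    return view(i, h) == view(i, h2)

def K(i, fact):
    """K_i fact holds at h iff fact holds at every history i cannot tell from h."""
    return lambda h: all(fact(h2) for h2 in P if indist(i, h, h2))

def C(U, fact):
    """C_U fact holds at h iff fact holds at every history reachable from h
    through the reflexive, transitive closure of the relations of members of U."""
    def holds(h):
        seen, stack = {h}, [h]
        while stack:
            cur = stack.pop()
            for h2 in P:
                if h2 not in seen and any(indist(i, cur, h2) for i in U):
                    seen.add(h2)
                    stack.append(h2)
        return all(fact(h2) for h2 in seen)
    return holds

# A private fact of processor 0: "my input bit is 1".
q0 = lambda h: h[0] == 1
```

In this toy model processor 0 knows its own bit and processor 1 does not, and common knowledge for the empty group collapses to plain truth, as noted above.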

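Since the $\approx_i$ are equivalence relations, the following S5 axioms and rules hold.

(1) All tautologies.

(2) $K_i(A) \land K_i(A \rightarrow B) \rightarrow K_i(B)$.

(3) $K_i(A) \rightarrow A$.

(4) $K_i(A) \rightarrow K_i(K_i(A))$.

(5) $\neg K_i(A) \rightarrow K_i(\neg K_i(A))$.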

The rules modus ponens and necessitation are sound, where the latter allows us to
infer a formula $K_i(A)$ provided that we have shown that the formula $A$ holds for all
histories $H$. This logic in the case when $L_0$ is propositional is known as LK5.

The same axiom system will also be sound if we replace $K_i$ by $C_U$. In this case we
can add one more axiom:

(6) $V \subseteq U \rightarrow (C_U A \rightarrow C_V A)$.

If $U = \{i\}$ then $K_i$ and $C_U$ coincide since $\approx_i$ is transitive.

In the following characterization of levels of knowledge we will not need all the
axioms: axiom 5 will not be used, so our results are valid also for the system in which
the $\approx_i$ are reflexive and transitive but not necessarily symmetric (the logic satisfies S4
axioms).

2.1  Levels  of  knowledge 

In this section we show how to define the level of knowledge of a formula as a set of
strings over the alphabet $\Sigma$ where $\Sigma = \{K_1, \dots, K_n\}$ and investigate properties of levels
of knowledge.

Consider a formula $A$ and some global history $H$. The formula may be true but
not known to anyone. In that case we have $H \models A$ but not $H \models K_i(A)$ for any $i$. Or
perhaps it may be known to some $i$ that $A$ is true. In the latter case, $K_i(A)$ is true
and $A$ will hold at all histories $H'$ such that $H' \approx_i H$. The formula $K_j K_i(A)$ expresses
the stronger assertion that not only is $A$ known to $i$, but also that this fact is known
to $j$. Still more is known in the system if $i$ and $j$ both know $A$, both know that both
know $A$, and so on. This is common (or mutual) knowledge of $A$ between $i$ and $j$ and
is denoted by $C_{\{i,j\}}(A)$. Thus a formula that is known may be known at a higher (in
some sense) or lower level.

The highest possible level of knowledge here is $C_N(A)$: $A$ is common knowledge
for the whole group, which holds if for all strings $x$ of knowledge operators, $xA$ holds.
We shall give now the precise definition of the level of knowledge, but first let's look
at the set $T(A,H)$ of all strings of knowledge operators $x$ such that $xA$ is true in $H$:

$T(A,H) = \{x \mid x \in \Sigma^* \text{ and } H \models xA\}$.

If there is some non-empty string $x_0$ in $T(A,H)$, then $T(A,H)$ is infinite. This is a
consequence of the following theorem:

Theorem 1. (a) Let $\Sigma$ be the alphabet whose symbols are $\{K_1, \dots, K_n\}$. For all $a$ in
$\Sigma$, and for all $x, y$ in $\Sigma^*$, and all formulae $A$,

$\vdash xayA \leftrightarrow xaayA$,

and hence for all $H$, $H \models xayA$ iff $H \models xaayA$. That is, repeated occurrences of $a$ are
without effect and if $xay \in T(A,H)$ then for all $n \ge 1$, $x a^n y \in T(A,H)$.

(b) For a subset $U$ of $\{1, \dots, n\}$ let $\Delta = \{K_i \mid i \in U\}$. Then for all histories $H$, and
formulae $A$, $H \models C_U(A)$ iff for all strings $x \in \Delta^*$, $H \models xA$.

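Proof. (a) is straightforward using the fact that $\approx_i$ is an equivalence relation. Let
$a = K_i$. Then $H \models ayA$ iff $\forall H'$, $H' \approx_i H \rightarrow H' \models yA$. But by transitivity of $\approx_i$ this yields:
$\forall H', H''$, if $H' \approx_i H$ and $H'' \approx_i H'$ then $H'' \models yA$, i.e. $H \models aayA$. The converse
implication follows from reflexivity (axiom 3). Since $ayA$ and $aayA$ are equivalent,
it is easily seen that so are $xayA$ and $xaayA$. The proof of (b) follows from the fact
that the relation $\approx_U^*$ is the reflexive transitive closure of $\bigcup \{\approx_i : i \in U\}$. □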

Since occurrences of substrings of the form $K_i K_i$ don't carry any more information
than strings $K_i$ we define levels of knowledge of formulae by excluding all strings
containing consecutive occurrences of the same knowledge operator.

DEFINITION 1

(a) A string $x$ is simple if $x$ contains no substrings $K_i K_i$.

(b) Given a formula $A$ and a history $H$, the level $L(A,H)$ of $A$ at $H$ is the set of all
simple $x$ in $\Sigma^*$ such that $H \models xA$.

If $H$ is clear from the context, or not important, then we shall drop it as a parameter.
The set of simple strings on an alphabet $\Sigma$ will be denoted $\Sigma^s$. Thus $L(A,H)$ will
always be a subset of $\Sigma^s$.
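As an illustration of Definition 1, the sketch below enumerates the part of a level $L(A,H)$ that consists of simple strings up to a bounded length. It is written in Python under stated assumptions: the toy protocol `P` (each history is just a pair of input bits, and processor $i$ sees only bit $i$), the encoding of a string of operators as a tuple of processor indices, and all function names are hypothetical choices made for this example.

```python
from itertools import product

# Hypothetical toy protocol: histories are pairs of input bits.
P = [(0, 0), (0, 1), (1, 0), (1, 1)]

def knows(i, fact):
    """K_i fact, assuming processor i sees only its own bit h[i]."""
    return lambda h: all(fact(g) for g in P if g[i] == h[i])

def holds(x, fact, h):
    """Evaluate xA at h, where x = (i_1, ..., i_k) stands for K_{i_1} ... K_{i_k}."""
    for i in reversed(x):          # the innermost operator is applied first
        fact = knows(i, fact)
    return fact(h)

def is_simple(x):
    """A string is simple if no operator occurs twice in a row."""
    return all(a != b for a, b in zip(x, x[1:]))

def level(fact, h, n_procs, max_len):
    """Simple strings x of length <= max_len with H |= xA (a truncation of L(A,H))."""
    return {x
            for k in range(max_len + 1)
            for x in product(range(n_procs), repeat=k)
            if is_simple(x) and holds(x, fact, h)}

bit0_is_1 = lambda h: h[0] == 1
```

In this toy model the private fact of processor 0 is known to processor 0 and to nobody else, so its truncated level contains only the empty string and the one-letter string for $K_0$; a tautology, by contrast, is at every level.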

2.2  Embeddability 

Now  we  will  try  to  characterize  levels  of  knowledge.  First  we  need  to  introduce  the 
embeddability  ordering  on  strings  which  turns  out  to  be  important. 

DEFINITION  2 

Given two strings $x$ and $y$, we say that $x$ is embeddable in $y$ ($x \preceq y$), if all the symbols
of $x$ occur in $y$, in the same order, but not necessarily consecutively. Formally: Let
$x = a_1 \cdots a_m$ and $y = b_1 \cdots b_p$. Then $x$ is embeddable in $y$ iff there is a function $f$ from
$\{1, \dots, m\}$ to $\{1, \dots, p\}$ such that $\forall i < j \le m$, $f(i) < f(j)$ and $a_i = b_{f(i)}$.

Thus  the  string  aba  is  embeddable  in  itself,  in  aaba  and  in  abca,  but  not  in  aabb. 
Properties of the embeddability relation $\preceq$

Fact 1. Embeddability is a well partial order, i.e. it is not only well founded, but
every linear order that extends it is a well order (equivalently, it is well founded and
every set of mutually incomparable elements is finite).

Fact 2. Embeddability can be tested in linear time by a two tape Turing machine.

For a proof of fact 1, see Higman (1952) and de Jongh & Parikh (1977). Fact 2 is
straightforward. 
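The linear-time test of Fact 2 amounts to a single left-to-right scan of $y$. A minimal Python sketch follows; the function name `embeds` and the representation of strings of operators as ordinary Python strings are assumptions made for this illustration.

```python
def embeds(x, y):
    """Return True iff x is embeddable in y, i.e. x is a subsequence of y.

    The iterator over y is shared by all membership tests, so each symbol
    of y is inspected at most once: the scan is linear in len(y).
    """
    remaining = iter(y)
    return all(symbol in remaining for symbol in x)
```

The examples given after Definition 2 can be replayed directly: `embeds("aba", "aaba")` and `embeds("aba", "abca")` hold, while `embeds("aba", "aabb")` does not.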

We also need a weaker (larger) relation defined on $\Sigma^*$, which we call K-embeddability.

DEFINITION 3

We define the K-embeddability relation $\preceq_K$ as follows:

If $x = a_1 \cdots a_m$ and $y = b_1 \cdots b_p$ are elements of $\{K_1, \dots, K_n\}^*$, then $x$ is K-embeddable
in $y$ iff there is a function $f$ from $\{1, \dots, m\}$ to $\{1, \dots, p\}$ such that $\forall i \le j \le m$, $f(i) \le f(j)$
and $a_i = b_{f(i)}$.

The condition defining $\preceq_K$ is weaker than that defining $\preceq$. Hence the relation $\preceq_K$
extends the relation $\preceq$, so that $aaba \preceq_K aba$, but $aaba \not\preceq aba$. However, for all simple
$x$ and for all $y$, $x \preceq y$ iff $x \preceq_K y$. Given a string $x$, there is a shortest simple $y$ such that
$x \preceq_K y$. We shall denote $y$ as Sim($x$), the simplification of $x$. For example, if $x = abbacbbaa$
then Sim($x$) $= abacba$.
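Simplification and K-embeddability are easy to experiment with. The Python sketch below is an illustration only: the names `sim`, `embeds` and `k_embeds` are hypothetical, and `k_embeds` relies on an observation assumed here rather than stated in the text, namely that equal adjacent symbols of $x$ may share a position in $y$, so $x$ is K-embeddable in $y$ exactly when Sim($x$) is embeddable in $y$.

```python
from itertools import groupby

def sim(x):
    """Sim(x): collapse every run of equal adjacent symbols to one symbol."""
    return "".join(symbol for symbol, _run in groupby(x))

def embeds(x, y):
    """Ordinary embeddability: x is a subsequence of y."""
    remaining = iter(y)
    return all(symbol in remaining for symbol in x)

def k_embeds(x, y):
    """K-embeddability, tested (by the assumption above) via Sim(x)."""
    return embeds(sim(x), y)
```

With these definitions `sim("abbacbbaa")` gives `"abacba"`, and `aaba` is K-embeddable but not embeddable in `aba`, matching the examples above.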

DEFINITION  4 

A downward closed subset of $\Sigma^s = \{K_1, \dots, K_n\}^s$ is a subset $X$ such that if $x \in X$ and
$y \preceq x$, then $y \in X$.

Theorem 2. For all strings $x \preceq y \in \Sigma^s$, all formulae $A$ and for all histories $H$, if $H \models yA$
then $H \models xA$.

Proof. We use induction on the sum of lengths of $x$ and $y$. If the sum is 0, then the
claim is immediate. Otherwise $y$ must be nonempty; let $y$ be $K_i y'$. Now either
$x \preceq y'$, or $x$ is $K_i x'$ and $x' \preceq y'$. In the first case $yA$ implies $y'A$ (by axiom 3), which (by
induction hypothesis) implies $xA$. In the second case, $y'A$ implies $x'A$ by induction
hypothesis, and therefore by necessitation and axiom 2, $K_i y'A$ implies $K_i x'A$, so we get
$yA$ implies $xA$. □

COROLLARY  1 

Every level of knowledge is a set of simple strings, downward closed with respect to
the order $\preceq$.

COROLLARY 2

The complement of every level of knowledge is upward closed with respect to $\preceq$.

So far we have a necessary condition for the set of strings of knowledge operators
to be the level of knowledge of some formula in some history. We can infer for
example that there is no formula $A$ and history $H$, such that $H \models K_2 K_1 A$ and
$H \models \neg K_2 A$. This is because if $K_2 K_1 \in L(A,H)$ then since $L(A,H)$ is downward closed,
$K_2$ is also in it. In this case, we could also have seen this fact directly by deriving
$K_2 A$ from $K_2 K_1 A$, but other cases might be more subtle. Thus we will need the
notion of the smallest downward closed set of strings of knowledge operators including
the given set $X$. We will call this set the downward closure of $X$ and denote it as dc($X$).
We start by investigating some properties of the operation of downward closure.

2.3 Downward closures

DEFINITION 5

The downward closure dc($X$) of $X \subseteq \Sigma^s$ is the smallest set $Y \subseteq \Sigma^s$ such that $X \subseteq Y$
and for all $x$, if $x \in Y$ and $y \preceq x$ then $y \in Y$.
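For a finite set $X$ the downward closure can be computed by brute force: collect every subsequence of every member of $X$ and keep the simple ones. The Python sketch below does exactly that; the names `dc` and `is_simple`, and the use of digits such as `"1"` and `"2"` to stand for $K_1$ and $K_2$, are assumptions of this illustration, and the enumeration is exponential in the string length, so it is only meant for small examples.

```python
from itertools import combinations

def is_simple(x):
    """No symbol occurs twice in a row."""
    return all(a != b for a, b in zip(x, x[1:]))

def dc(X):
    """Downward closure of a finite set X of simple strings:
    all simple strings embeddable in some member of X."""
    closure = set()
    for x in X:
        for k in range(len(x) + 1):
            for positions in combinations(range(len(x)), k):
                y = "".join(x[p] for p in positions)
                if is_simple(y):
                    closure.add(y)
    return closure
```

For instance `dc({"21"})` contains `"2"`, which is the observation made above that $K_2 K_1 \in L(A,H)$ forces $K_2 \in L(A,H)$.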

Properties  of  downward  closure 

Facts 3 and 4 depend only on the fact that $\preceq$ is a partial order.

Fact 3. If $B$ is downward closed then for all $A$, $A \subseteq B$ iff for every $x \in A$ there is $y \in B$
such that $x \preceq y$.

Fact 4. dc($A \cup B$) = dc($A$) $\cup$ dc($B$).

Fact 5. dc($A; B$) = Sim(dc($A$); dc($B$)) where, for $X \subseteq \Sigma^*$, Sim($X$) = $\{$Sim($x$) $\mid x \in X\}$.
Obviously Sim($X$) $\subseteq \Sigma^s$.

In future, we will omit the operator Sim in contexts where non-repeating strings
are the only strings involved, stipulating that for subsets $X, Y$ of $\Sigma^s$, $X; Y$ will indicate
Sim($X; Y$) where the second ';' denotes concatenation of $X, Y$ as subsets of $\Sigma^*$. $A^s$
will denote Sim($A^*$).

Fact 6. dc($A^s$) = $U^s$ where $U = \{\sigma \mid \sigma \in \Sigma,\ \exists x \in A,\ \sigma \preceq x\}$.

Facts 3, 4 and 5 are straightforward. To prove fact 6 first notice that for any $U$,
$U^s$ is downward closed, and clearly dc($A^s$) $\subseteq U^s$. To show that $U^s \subseteq$ dc($A^s$) let's assume
that in fact there is some string $x \in \Sigma^s$, such that $x \in U^s$ and $x \notin$ dc($A^s$). Say $x = a_1 a_2 \cdots a_n$,
where for every $a_i$, $a_i \in \Sigma$ and there is $x_i \in A$ such that $a_i \preceq x_i$. Then $x_1 x_2 \cdots x_n \in A^n$, therefore
$x_1 x_2 \cdots x_n \in A^s$. Clearly dc($\{x_1 x_2 \cdots x_n\}$) $\subseteq$ dc($A^s$), but by fact 5, dc($\{x_1 x_2 \cdots x_n\}$) =
dc($\{x_1\}$) $\cdots$ dc($\{x_n\}$). So since $a_i \preceq x_i$ for all $i = 1, \dots, n$, we have $a_i \in$ dc($\{x_i\}$) for all
$i = 1, \dots, n$, so $x = a_1 a_2 \cdots a_n \in$ dc($\{x_1 x_2 \cdots x_n\}$) $\subseteq$ dc($A^s$) and we get a contradiction.

Fact 7. If $ax \preceq by$ and $a \ne b$ then $ax \preceq y$.

2.4  Characterization  of  levels  of  knowledge 

Now  we  look  at  the  possibility  of  characterization  of  levels  of  knowledge. 

Let $L(A,H) = L(A)$ be a level of knowledge. Let $\overline{L(A)}$ denote the complement of
$L(A)$ with respect to $\Sigma^s$. Then $\overline{L(A)}$ is upward closed and under each element $y$ of
$\overline{L(A)}$ there is a minimal element $x$. Let $m(A) = \{x_1, \dots, x_k\}$ be the set of minimal
elements of $\overline{L(A)}$. Then the elements of $m(A)$ are mutually incomparable and since
$\preceq$ is a well partial ordering, $m(A)$ is finite. Now we get:

$\overline{L(A)} = \{y \mid \exists x \in m(A),\ x \preceq y\}$, i.e. $L(A) = \{y \mid \forall x \in m(A),\ x \not\preceq y\}$.

Thus  the  level  of  A  is  completely  characterized  by  the  finite  set  m(A)  and  we  get 
the  next  theorem. 

Theorem 3. There are only countably many levels of knowledge and in fact all of
them are regular subsets of $\Sigma^s$ (or $\Sigma^*$).

Proof. Since $m(A)$ is finite, a finite automaton can clearly be designed to test whether
$x \preceq y$ holds for some element $x$ of $m(A)$, where $y$ is the input. The fact that there are
only countably many levels of knowledge follows immediately. □

COROLLARY 1

The  membership  problem  for  a  level  of  knowledge  is  solvable  in  linear  time. 
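The membership test behind Theorem 3 and its corollary can be sketched as follows: a simple string $y$ belongs to the level exactly when none of the finitely many minimal strings of the complement embeds in it. The Python code below is an illustration under assumptions: the names `embeds` and `in_level` are hypothetical, operators $K_1, K_2, K_3$ are written as the digits `"1"`, `"2"`, `"3"`, and the sample sets of minimal strings are made up for the example. For a fixed set $m(A)$, each test is one linear scan of $y$ per minimal string.

```python
def embeds(x, y):
    """x is a subsequence of y (one linear scan of y)."""
    remaining = iter(y)
    return all(symbol in remaining for symbol in x)

def in_level(y, minimal_excluded):
    """y is in the level iff no minimal element of the complement embeds in y.

    minimal_excluded plays the role of m(A); y is assumed to be simple.
    """
    return not any(embeds(x, y) for x in minimal_excluded)

# Hypothetical example: the only minimal string outside the level is K_2 K_1.
m_A = ["21"]
```

With `m_A = ["21"]` the level consists of the empty string, `"1"`, `"2"` and `"12"`: processor 1 knows that processor 2 knows $A$, but not the other way round.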

For a set $L(A)$ we could also look at the set of all maximal elements of $L(A)$ (with
respect to $\preceq$). Clearly these maximal elements are also mutually incomparable.
However, the set $L(A)$ need not be characterized by them since if $A$ is common
knowledge ($L(A) = \Sigma^s$) then the set of maximal elements is empty. The set of maximal
elements is also empty if $L(A) = \Delta^s$ where $\Delta$ is a proper subset of $\Sigma$. Since distinct
sets $L(A)$ can have the same maximal elements, maximal elements cannot characterize
$L(A)$.

However,  with  finite  levels,  maximal  elements  do  in  fact  characterize  L(A). 

Theorem 4. If $L$ is a non-empty finite subset of $\Sigma^s$, then $L$ is downward closed iff for
some $k$ and $x_i$, $i \le k$, where $x_i \in \Sigma^s$,

$L = \bigcup_{i=1}^{k} \mathrm{dc}(\{x_i\})$.

Proof. Clearly if $L$ is a union of downward closed sets, then by fact 4, $L$ is downward
closed.

Conversely, let $L = \{x_1, \dots, x_m\}$ be a finite downward closed set.

Consider $L' = \bigcup_{i=1}^{m} \mathrm{dc}(\{x_i\})$. Clearly $L \subseteq L'$. Let $x \in L'$. Then there is $i$ such that
$x \in \mathrm{dc}(\{x_i\})$. Since $x_i \in L$ and $L$ is downward closed, then $x \in L$. So $L = L'$ and therefore
$L'$ is the required representation of $L$. □

Remark. Note that we could have taken just the maximal elements of $L$ and they
would be mutually incomparable in that case. If two finite sets $L(A)$, $L(A')$ have the
same maximal elements, they must coincide.

In  order  to  analyse  infinite  levels,  we  first  need  to  establish  some  properties  of 
downward  closed  sets.  The  following  theorem  generalizes  the  representation  from  the 
previous  theorem  to  the  case  of  infinite  levels.  It  will  be  used  in  obtaining  a  “normal 
form”  theorem  for  the  levels  of  knowledge. 

DEFINITION  6 

A subset $L$ of $\Sigma^s$ is star-linear iff there exist strings $x_1, \dots, x_{m+1}$ and subsets $\Delta_1, \dots, \Delta_m$
of $\Sigma$ such that $L = \mathrm{dc}[\{x_1\}\Delta_1^*\{x_2\}\Delta_2^* \cdots \Delta_m^*\{x_{m+1}\}] \cap \Sigma^s$.

Theorem 5. If $L$ is a subset of $\Sigma^s$, then $L$ is downward closed iff $L$ is a finite union
of star-linear sets.

Proof.

($\Leftarrow$) Star-linear sets are downward closed, so by fact 4, so is $L$.

($\Rightarrow$) $L$ is a downward closed subset of $\Sigma^s$, so $L$ is regular and $L$ is the language accepted
by some finite automaton $M$ ($L = L(M)$).




Let $S$ be the set of all states of $M$. Let $s \xrightarrow{x} t$ iff there is a sequence of transitions
from the state $s$ to the state $t$ labeled by the symbols forming the string $x$. We define
an equivalence relation $\sim$ on $S$: $s \sim t$ iff $\exists x, y$ such that $s \xrightarrow{x} t$ and $t \xrightarrow{y} s$. Intuitively, two states
are $\sim$-equivalent iff they are in the same loop in $M$. Now we define a new automaton
$M'$, whose states are the equivalence classes $[s]$ of $\sim$. We put $[s] \xrightarrow{\sigma} [t]$ if there exist $s' \in [s]$
and $t' \in [t]$ such that $s' \xrightarrow{\sigma} t'$. We also add transitions $[s] \xrightarrow{\sigma} [s]$ if $\exists s', s'' \in [s]$ such that
$s' \xrightarrow{\sigma} s''$ in $M$. The accepting states of $M'$ are the equivalence classes of accepting states
in $M$, and the initial state of $M'$ is the equivalence class of the initial state $q_1$ of $M$.

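It is easy to see that $L(M) \subseteq L(M')$. We will show that $L(M') \subseteq L(M)$.

Let $x \in L(M')$, $x = a_1 a_2 \cdots a_m$, with $[s_1] \xrightarrow{a_1} [s_2] \xrightarrow{a_2} [s_3] \cdots [s_m] \xrightarrow{a_m} [s_{m+1}]$ where $s_{m+1}$ is
accepting in $M$ and $s_1 = q_1$ is the initial state.

If $[s_1] \xrightarrow{a_1} [s_2]$, there must be states $s'_1, s'_2$ such that $s'_1 \in [s_1]$, $s'_2 \in [s_2]$ and $s'_1 \xrightarrow{a_1} s'_2$
in $M$.

Let $y_1, y_2$ be strings such that $s_1 \xrightarrow{y_1} s'_1$, $s'_2 \xrightarrow{y_2} s_2$, so we have a string $z_1 = y_1 a_1 y_2$ for
which $s_1 \xrightarrow{z_1} s_2$ in $M$. Repeating this procedure we can find strings $z_2, \dots, z_m$ such that
$s_1 \xrightarrow{z_1} s_2 \cdots s_m \xrightarrow{z_m} s_{m+1}$ and for all $j \le m$, $a_j \preceq z_j$. Therefore $z = z_1 z_2 \cdots z_m \in L(M)$ ($s_{m+1}$
is accepting in $M$) and since $L(M)$ is downward closed and $x \preceq z$, $x \in L(M)$.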

We  have  shown  that  L(M)  =  L(M').  We  show  now  that  the  latter  has  the  required 
form. 

The only loops in $M'$ are of the form $[s] \xrightarrow{\sigma} [s]$. A state $[s]$ is cycling if there is a $\sigma$
such that $[s] \xrightarrow{\sigma} [s]$.

Let $B_l$ be a sequence $[s_1] a_1 [s_2] a_2 \cdots [s_{l+1}]$ from the initial state $[s_1] = [q_1]$ of $M'$
to an accepting state $[t] = [s_{l+1}]$, where the $a_j$ are symbols such that $[s_j] \xrightarrow{a_j} [s_{j+1}]$,
and $[s_j] \ne [s_{j+p}]$ for any $p$ (there are no loops in the sequence). With this sequence
we associate a star-linear set $X = \mathrm{dc}[\Delta_1^* a_1 \Delta_2^* \cdots a_l \Delta_{l+1}^*] \cap \Sigma^s$ where, for all $i$, if $[s_i]$ is
non-cycling, then $\Delta_i = \emptyset$ and otherwise, $\Delta_i = \{\sigma \mid [s_i] \xrightarrow{\sigma} [s_i]\}$. Then the set of
non-repeating strings that take us from $[s_1]$ to $[t]$ is exactly $X$.

If  we  take  the  union  of  all  such  sets  X  (there  are  finitely  many  of  them),  then  we 
get  the  characterization  of  L(M')  in  the  required  form.  □ 

We  have  found  a  representation  of  downward  closed  sets,  and  therefore  of  levels 
of  knowledge.  We  prove  now  that  there  is  a  unique  minimal  representation  of  such 
a  form. 

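Theorem 6. Every star-linear set $X$ is directed with respect to the embeddability relation
$\preceq$. In other words: for all $x', x'' \in X$ there is $x \in X$ such that $x' \preceq x$ and $x'' \preceq x$.

Proof. Let $x' = y_1 v_1 \cdots y_m v_m y_{m+1}$ and $x'' = z_1 w_1 \cdots z_m w_m z_{m+1}$ where $y_i, z_i \in \mathrm{dc}(\{x_i\})$,
and $v_i, w_i \in \Delta_i^*$ for $i = 1, \dots, m$. Then if we take $x = x_1 v_1 w_1 \cdots x_m v_m w_m x_{m+1}$, then clearly
$x' \preceq x$, $x'' \preceq x$ and of course $x \in X$. □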

Theorem 7. If $X \subseteq \bigcup_{j=1}^{l} Y_j$, where $X, Y_j$ are all star-linear, then there exists a $j$ such
that $X \subseteq Y_j$.

Proof. Suppose that $X \subseteq \bigcup_{j=1}^{l} Y_j$ and for all $j$, $X \not\subseteq Y_j$; then there are $x_j \in X$,
$j = 1, \dots, l$, such that $x_j \notin Y_j$. By theorem 6 applied $l - 1$ times, we can find $x \in X$ such
that for all $j$, $x_j \preceq x$. Since all $Y_j$ are downward closed, $x_j \preceq x$ and $x_j \notin Y_j$, so $x$ is not
in any $Y_j$. Thus $x \notin \bigcup_{j=1}^{l} Y_j$, but $x \in X$ and we get a contradiction. □

Theorem 8. If $L$ is downward closed then $L$ has a unique representation as a finite
minimal union of star-linear sets.

Proof. Clearly we have a minimal representation of $L$, $L = \bigcup_{i=1}^{k} X_i$, and suppose we
have some other representation of $L$ of the same form $L = \bigcup_{j=1}^{l} Y_j$, where all $X_i$ and
$Y_j$ are star-linear sets. Then for every $i$, $X_i \subseteq \bigcup_{j=1}^{l} Y_j$. So by theorem 7, for every
$X_i$ there is $Y_j$ such that $X_i \subseteq Y_j$. Similarly, by applying theorem 7 to $Y_j$ we get some
$m$ such that $Y_j \subseteq X_m$, so we have for every $i$, some $j, m$ such that $X_i \subseteq Y_j \subseteq X_m$. By
minimality of the $X_1, \dots, X_k$ representation $i = m$, so we have $X_i \subseteq Y_j \subseteq X_i$ and hence
$X_i = Y_j$. Similarly, every $Y_j$ equals some $X_i$ and the two representations are the same.
□


COROLLARY  (Normal  form  theorem) 

Every  level  of  knowledge  L  of  a  formula  A  in  a  distributed  system  has  a  unique 
representation  as  a  finite  minimal  union  of  star-linear  sets. 

2.5  Representation  of  levels  of  knowledge  using  minimal  strings  of  the  complement 

It  turns  out  that  the  representation  of  levels  of  knowledge  using  star-linear  sets  is 
convenient  if  we  want  to  realize  at  least  a  given  level  of  knowledge  and  if  we  have 
both  synchronous  and  asynchronous  communications  in  the  system.  It  is  not 
appropriate  if  we  want  to  realize  at  most  a  given  level  and  if  we  have  only  synchronous 
communications  in  the  system.  In  such  cases  the  following  theorem  is  useful: 

DEFINITION  7 

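Given strings $x_1, \dots, x_k$, not containing repetitions, let $N(x_1, \dots, x_k)$ be the set

$\{y \mid \forall i \le k,\ x_i \not\preceq y\}$.

Theorem 9

(a) $N(x_1, \dots, x_k) = \bigcap \{N(x_i) : i \le k\}$.

(b) If $x = a_1 \cdots a_m$ then $N(x) = (\Sigma - a_1)^s \cdots (\Sigma - a_m)^s$.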

Proof. The first part is obvious.

To see (b) we use induction on $m$.

First of all, it is clear that $N(x)$ includes the right hand side. For suppose
$y \in (\Sigma - a_1)^s \cdots (\Sigma - a_m)^s$. If $y$ has no $a_1$, then it is certainly in $N(x)$. Else the first
occurrence of an $a_1$ in $y$ must be from the $(\Sigma - a_2)^s$ at the earliest; $y$ is then of the
form $y_1 a_1 y_2$ where $y_1$ contains no $a_1$ and $y_2$ is in $(\Sigma - a_2)^s \cdots (\Sigma - a_m)^s$. By induction
hypothesis $a_2 \cdots a_m \not\preceq y_2$ and so $x \not\preceq y$.

Suppose now that $y$ is in $N(x)$. We want to show that $y \in (\Sigma - a_1)^s \cdots (\Sigma - a_m)^s$. If
$m = 1$ then this is immediate. Otherwise we know that $x \not\preceq y$.

If $a_1 \not\preceq y$ then $y \in (\Sigma - a_1)^s$ and hence $y \in (\Sigma - a_1)^s \cdots (\Sigma - a_m)^s$.

If $a_1 \preceq y$ then $y = y_1 a_1 y_2$, where $a_1 \not\preceq y_1$ and $a_2 \cdots a_m \not\preceq y_2$. In that case $y_2 \in (\Sigma - a_2)^s \cdots
(\Sigma - a_m)^s$ by the induction hypothesis and $y_1 \in (\Sigma - a_1)^s$. Since $a_1 \ne a_2$, the result
follows. □
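Theorem 9(b) can be checked mechanically on small alphabets. The Python sketch below compares the two sides for all short simple strings: `embeds` tests $x \preceq y$, and `in_product` tests membership of a simple $y$ in $(\Sigma - a_1)^s \cdots (\Sigma - a_m)^s$ by a greedy left-to-right split. The function names, the greedy strategy and the three-letter alphabet are assumptions of this illustration; the greedy split relies on $x$ being simple, as Definition 7 requires.

```python
from itertools import product

def embeds(x, y):
    """x is a subsequence of y."""
    remaining = iter(y)
    return all(symbol in remaining for symbol in x)

def is_simple(x):
    return all(a != b for a, b in zip(x, x[1:]))

def in_product(x, y):
    """Greedy test for y in (S - a_1)^s ... (S - a_m)^s, where x = a_1 ... a_m
    is simple and non-empty: block i is the longest stretch of the unread
    part of y that avoids a_i."""
    pos = 0
    for a in x:
        while pos < len(y) and y[pos] != a:
            pos += 1
    return pos == len(y)           # all of y was consumed by the m blocks

def simple_strings(alphabet, max_len):
    for k in range(max_len + 1):
        for letters in product(alphabet, repeat=k):
            if is_simple(letters):
                yield "".join(letters)

def theorem_9b_holds(alphabet="abc", max_x=3, max_y=5):
    """Compare 'x does not embed in y' with the product form, exhaustively."""
    return all(in_product(x, y) == (not embeds(x, y))
               for x in simple_strings(alphabet, max_x) if x
               for y in simple_strings(alphabet, max_y))
```

Such an exhaustive comparison is of course no substitute for the induction above; it only confirms the statement on the finitely many cases enumerated.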
3. Model of a distributed system

We assume that there are a finite number of processes, $1, \dots, n$, which compute and
communicate with each other either by asynchronous messages or by broadcasts.
Our network is assumed to be fully connected³ (there is a channel from every process
to every other process).

Asynchronous  communication  consists  of  two  phases:  send  and  receive.  All  messages 
sent  are  ultimately  delivered  (and  in  the  order  in  which  they  were  sent)  but  the  delay 
(transmission  time)  may  be  arbitrarily  long. 

Broadcasts are fully reliable, synchronous communications⁴ where all processes
involved simultaneously receive the message sent by one of them. Our broadcasts are
therefore similar to CSP communication, but allow for more than two participants at
a time. Later on we will point out a limitation of CSP that follows from our results.

We consider three kinds of systems: systems where only asynchronous communication
is available, systems where only synchronous communication is available and
systems where both kinds of communication are available.

There  is  assumed  to  be  a  global  clock  in  the  background  which  orders  all  events, 
but  this  clock  is  not  assumed  to  be  available  to  the  processes.  Time  is  discrete,  the 
clock  is  set  to  0  at  the  beginning  of  every  computation  and  grows  in  increments  of  one. 

In  order  for  the  processors  to  have  something  to  communicate,  we  ensure  that  for 
every  processor  in  our  system  there  are  facts  known  initially  only  to  this  processor. 
To  accomplish  that  we  assume  that  every  process  starts  its  computation  with  some 
initial  value,  a  finite  string  of  0’s  and  l’s.  This  initial  value  may  correspond  to  some 
local  non-deterministic  input,  e.g.  the  result  of  a  sequence  of  coin  tosses  by  a  processor. 

At  each  moment  of  global  time,  zero  or  more  events  may  take  place,  at  most  one 
at  each  process.  This  finite  set  of  local  events  constitutes  a  global  event  and  a  global 
history  is  simply  a  possible  sequence  of  global  events.  What  global  histories  are 
possible  is  determined  by  the  programs  of  the  various  individual  processes  as  well  as 
by  the  properties  of  the  means  of  communication.  The  protocol  P  is  just  the  set  of 
all  possible  global  histories,  closed  under  the  prefix  operation. 

Different  models  of  distributed  systems  have  been  used  by  different  authors.  Our 
model  is  similar  to  that  used  by  Parikh  &  Ramanujam  (1985,  pp.  256-268),  Halpern  & 
Moses  (1990)  and  by  Chandy  &  Misra  (1986). 

There are, however, some small differences: we assume that there is no global clock
in the system accessible to the processes ("common clock" in Parikh & Ramanujam
1985, pp. 256-268). Parikh & Ramanujam (1985, pp. 256-268) do not have this
restriction. In their terminology, our local histories are always individual views of
"time-free" global histories.

³ If the network is not fully connected then some levels of knowledge may be impossible to
realize due to the lack of communication capabilities, e.g. if a processor is isolated (cannot
communicate with anyone) then the other processes cannot learn anything from that process.
Interesting questions arise in case of a directed network where every process may communicate
with every other process but some communications are necessarily indirect (go through other
processes). We will not analyse this case here.

⁴ The two kinds of communications can be thought of as two kinds of communication media,
e.g. mailing system (asynchronous) and telephone lines (synchronous). Since we allow for
synchronous communication between more than two processes at a time, our telephone system
must have "conference call" capability. Note that the 'handshake' in CSP can be thought of as
a special case of a broadcast involving exactly two processes.

Chandy  &  Misra  (1986)  do  not  allow  different  events  to  occur  locally  in  different 
sites  at  the  same  instant  of  time.  We  shall  not  impose  this  restriction.  This  can  be 
interpreted  in  two  ways:  either  we  can  treat  time  in  our  system  as  “less  refined”  than 
in  Chandy  &  Misra  (1986),  or  we  can  interpret  our  system  as  synchronous,  events  in 
all  sites  being  governed  by  the  same  global  clock.  Differences  in  local  time  will  then 
arise  solely  from  the  fact  that  a  process  may  not  have  an  event  happening  at  every 
moment  of  global  time.  If  a  process  is  inactive  in  a  certain  round  (its  action  is  the 
“null”  event),  it  cannot  perceive  that  time  has  passed.  Hence  even  while  assuming 
that  our  system  runs  in  synchronous  rounds,  we  ensure  that  processes  are  not  able 
to  record  global  time.  Consequently  they  cannot  draw  any  conclusions  about  the 
others  merely  by  observing  their  own  local  clock. 

There  are  some  terminological  differences  between  our  description  of  a  distributed 
system  and  that  of  Halpern  &  Moses.  Following  Parikh  &  Ramanujam  (1985, 
pp.  256-268),  we  shall  denote  the  set  of  all  global  histories  (runs  in  Halpern  &  Moses 
1990)  as  the  protocol.  Halpern  &  Moses  (1990)  use  the  word  protocol  to  describe 
the  rules  governing  actions  of  processors  (effectively  generating  the  set  of  runs). 

3.1  Definitions 

Here we formally specify our class of models. Let $N = \{1, \dots, n\}$ be the set of all processors.
Every processor $i$ has infinitely many possible initial states $v_i$ and an initial state is a
string of 0's and 1's ($v_i \in \{0,1\}^*$). We denote the set of initial states for $i$ by $V_i$. The set
of global initial states is $\mathcal{V} = \prod_{i=1}^{n} V_i$.

From now on we will use lower case letters to denote everything pertaining to a
single process. Capitals without subscripts will be used where all the processes are
involved (e.g. $v_i$ is an initial state of a processor $i$, while $V$ is an initial configuration
of the whole system: $V = (v_1, \dots, v_n)$). $\mathcal{V}$ is the set of all such $V$. $v_i$, the initial input of the
processor $i$, provides the interpretation for private facts in our logic, i.e. $Q_{i,m}$ is true iff
the $m$th bit of $v_i$ is 1. The $m$th private fact $P_{i,m}$ of a processor $i$ is whether $Q_{i,m}$ or its
negation is true.

Events. $E_i$ denotes the set of all events in which processor $i$ can participate (events
local to $i$). There are the following types of events (or actions):

(1) $L_i$: Local computation steps. (We assume that for $i \ne j$, $L_i \cap L_j = \emptyset$.)

(2) $s(i,j,m)$: Sending a message $m$ to a processor $j$, $j \in N$.

(3) $r(i,j,m)$: Receiving a message $m$ from a processor $j$, $j \in N$.

(4) $bc(i,U,m)$: Sending a broadcast $m$ to a group of processors $U$, $i \in U \subseteq N$. The same
event is also in $E_j$ for all $j$ in $U$.

$E_i = L_i \cup \{s(i,j,m) \mid m \in M, j \in N\} \cup \{r(i,j,m) \mid m \in M, j \in N\}$
$\cup\ \{bc(j,U,m) \mid m \in M,\ i, j \in U \subseteq N\} \cup \{bc(i,U,m) \mid m \in M,\ i \in U \subseteq N\}$

Note that the last two items can be combined into one as $\{bc(j,U,m) \mid m \in M,\ i, j \in U \subseteq N\}$.
$M$ is the set of messages, defined below.

We define the set of global events $G$ in our system. $G \subseteq \prod_{i=1}^{n} (E_i \cup \{null\})$ (a
cartesian product) such that if $(e_1, \dots, e_i, \dots, e_n) \in G$ for some $i$ and $e_i = bc(j,U,m)$ then
for all $i' \in U$, $e_{i'} = bc(j,U,m)$. If $e_i = null$ for some $i$, it means that there is no local
event at $i$ at this point. Note that null is not local to any process. We use the notation
$(G)_i$ to denote the $i$th coordinate of a global event $G$, so $(e_1, \dots, e_i, \dots, e_n)_i = e_i$.


Histories. A history (a run) is an input value followed by a sequence of events. The
set of all possible histories of the system will be called the protocol $P$. So $P \subseteq \mathcal{V}; G^*$.
Protocols are always closed under taking an initial segment of a history: $H \in P$ implies
that every $H'$ which is an initial segment of $H$ is in $P$.

We  will  require  that  for  every  receive  in  every  history  in  every  protocol,  there  is 
exactly  one  corresponding  send,  which  occurs  before  the  receive  (this  condition  will 
be  called  time-consistency). 

We  say  that  two  histories  H  and  H'  are  compatible  iff  they  start  with  the  same 
input  values  (initial  states). 

We  define  the  concatenation  of  compatible  histories: 

If $H_1 = V; G_1; \dots; G_k$ and $H_2 = V; G'_1; \dots; G'_l$ then the concatenation of $H_1$ and $H_2$
is the history $H = V; G_1; \dots; G_k; G'_1; \dots; G'_l$.

Local  histories  are  the  projections  of  global  histories  onto  the  sets  of  local  events 
of  the  processors.  They  are  “time-forgetting”,  i.e.  they  erase  null  events. 

We  assume  that  a  global  event  -  the  ticking  of  the  clock  -  takes  place  even  if  no 
local  events  take  place  at  a  particular  moment  (this  corresponds  to  events  of  the  form 
(null, . . . ,  null)),  and  so  the  length  of  a  global  history  is  just  the  amount  of  time  elapsed. 
However,  what  each  process  sees  at  any  moment  of  time  is  its  local  event,  if  any,  and 
its  local  history  is  simply  the  sequence  of  local  events.  Given  i,  and  the  global  history 
H,  the  local  history  /i,  is  uniquely  defined  and  we  let  G>;  be  the  map  which  takes  us 
from  H  to  ht. 

We can inductively define the $\Phi_i$'s as follows:

$\Phi_i((v_1, \ldots, v_n)) = (v_1, \ldots, v_n)_i = v_i$,

$\Phi_i(H; G) = \Phi_i(H); e_i$ if $(G)_i = e_i \neq \text{null}$, and $\Phi_i(H; G) = \Phi_i(H)$ if $(G)_i = \text{null}$.

So $\Phi_i$ is like $(\cdot)_i$ (when we extend the projection operation from events to histories)
except that it erases null events.

Its local history is all that a processor sees, so all global histories which correspond
to the same local history $h_i$ look the same to the processor $i$. Note that the length of
$\Phi_i(H)$ is less than or equal to the length of $H$. In fact length$(\Phi_i(H))$ = length$(H)$ iff
there are no null events on $i$ in $H$.
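A short Python sketch of the projection may help. It assumes, purely for illustration, that null is modelled by `None`, that a global event is a tuple with one slot per process (processes numbered from 1), and that sends and receives are tagged tuples; none of these encodings come from the text.

```python
NULL = None  # stands for the null event


def project(inputs, events, i):
    """Local history of process i: its input value, then its non-null local events."""
    local = [inputs[i - 1]]
    for g in events:
        if g[i - 1] is not NULL:      # "time-forgetting": null events are erased
            local.append(g[i - 1])
    return tuple(local)


inputs = ("1", "0")
events = [
    (("s", 1, 2, "P1"), NULL),        # 1 sends P1 to 2
    (NULL, NULL),                     # the clock ticks, nothing else happens
    (NULL, ("r", 2, 1, "P1")),        # 2 receives P1 from 1
]
h1 = project(inputs, events, 1)
h2 = project(inputs, events, 2)
```

Both local histories are shorter than the global one here, because each process has null events in it.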

For every $i$ we can define an equivalence relation on the set of global histories:
$H \approx_i H'$ iff $\Phi_i(H) = \Phi_i(H')$.

If $U$ is a subset of $N$, then we let $\approx_U$ be the reflexive transitive closure of $\bigcup_{i \in U} \approx_i$.

We  use  capital  letters  to  denote  global  histories,  events  etc.,  lower  case  letters  to 
denote  local  histories,  events  etc. 


Time. The global time of an event $G$ in a global history $H$, Time$(G, H)$, is the length
of the initial segment of $H$ up to and including $G$, the time when the event $G$ has
occurred in the history $H$.

Note that since null $\notin E_i$ for all $i$, processes do not have access to the global clock.
They can introduce their own local logical clocks, but these clocks do not have to
coincide. Local time of $e_i$ in $h_i$ can be defined as the length of $h_i$ up to
and including $e_i$.

The  lack  of  the  access  to  the  global  clock,  together  with  the  closure  conditions  for 
the  protocol  guarantee  that  the  only  possible  “causality”  ordering  that  can  be  defined 
corresponds  in  case  of  asynchronous  systems  to  Lamport’s  (1978)  “happened  before” 
ordering5. 

Messages. Messages $m$ are knowledge formulae (formulae of the form $xA$ where $x$
is a string of knowledge operators $K_{i_1} K_{i_2} \cdots K_{i_n}$ and $A$ is some boolean combination
of $P_1, \ldots, P_n$). We assume that the processes are “honest”: they only send (or broadcast)
messages which they know are true. Formally, if $\Phi_i(H) = h_i; s(i, j, B)$ or $\Phi_i(H) = h_i;
bc(i, U, B)$ and $H \in P$, then if $\Phi_i(H') = h_i$, then $H' \models B$. Later on we will also allow
certain ‘common knowledge’ symbols $C_U$ to occur in addition to the $K_i$.

We  remark  that  the  notion  of  a  message  and  that  of  a  history  should  really  be 
defined  via  mutual  recursion  since  the  truth  of  a  message  depends  on  the  histories 
that  are  possible  and  the  histories  that  are  possible  can  contain  only  true  messages. 
However,  for  the  messages  of  the  type  we  have  described,  the  truth  of  a  message 
depends  only  on  the  portion  of  the  history  that  has  already  elapsed.  Hence  the 
recursion  is  allowable.  If  the  messages  were  to  contain  future  modalities,  then  there 
could  be  problems,  which  arise  due  to  circularity. 

We illustrate our semantics by means of an example. Suppose that for $i \leq 3$, $P_i$ is
a private fact of process $i$. Suppose also that in a history $H$, $P_i$ are all true. Now
process 3 receives a message from process 2 that $P_1 \to P_3$. 3 knows that the message
is correct because $P_3$ is in fact true. However, 3 has never sent any messages to anyone
at all. Now 3 can reason, “2 cannot know that $P_3$ is true so 2 must know that $P_1$ is
false.” We show how to do this argument in our framework. If we replace $H$ by $H'$
where $P_3$ is false, but otherwise the same as $H$, then $H \approx_2 H'$. Now 2 must have sent
in $H$ only a message that he knew to be true, so we must also have $H' \models K_2(P_1 \to P_3)$.
Hence $H' \models (P_1 \to P_3)$ so that $H' \models \neg P_1$. Since we only used the fact that $H \approx_2 H'$, we
get $H \models K_2(\neg P_1)$. However, this argument could be applied to any $H'' \approx_3 H$ so that
we get $H \models K_3(K_2(\neg P_1))$.

Closure conditions for the protocol. We impose some additional conditions on the
protocol $P$. We want to ensure that the initial state of $i$, $(v_i)$, cannot be known to any
other process $j$ at any run of the system, unless $j$ learns about $v_i$ from some
communication. We want to exclude the possibility that something may be known
“accidentally”. To achieve that we will make sure that all initial states are possible.
Moreover, if $v_i$ is the initial state of $i$, all other strings $v'_i$ will remain possible for $j$ as
initial states of $i$, unless $j$ gets some message to the contrary (directly from $i$ or via
some other processors).

(1) All vectors of input values are possible: for every $V$ such that $V = (v_1, \ldots, v_n)$, where every
$v_i$ is a sequence of 0's and 1's, there is some $H \in P$ such that for some $H'$, $H = V; H'$
(note that $H'$ cannot itself be a history since it lacks an input value).


5 $e_1 \to e_2$ iff $e_1$ is a send and $e_2$ is the receive of the same message, or $e_1, e_2$ are local to the same process
and $e_1$ occurred earlier than $e_2$. $e_1$ happened before $e_2$ iff $e_1 \to^* e_2$, where $\to^*$ is the reflexive,
transitive closure of $\to$. In a system where we allow broadcast one more condition is needed
in the definition of $\to$: $e_1 \to e_2$ if $e_1$ and $e_2$ are two local projections of the same broadcast.

(2) No sequence of local events on some group of processes can influence possible
actions  of  some  other  group  of  processes  unless  there  are  some  communications 
(assuming  that  both  groups  are  disjoint). 

For  that  we  need  some  closure  conditions  on  the  set  of  all  protocols.  The  first 
condition  we  use  is  due  to  Chandy  &  Misra  (1986)  (it  is  the  first  of  their  principles 
of  computation  extension). 

We  need  one  definition: 

Let $G = (e_1, \ldots, e_n)$. $G$ is on $U$ if $U = \{i \mid (G)_i \neq \text{null}\}$ (so $U$ is the set of processes which
have some local events in $G$).

Closure  conditions: 

(i) Extension rule: Suppose that $H \approx_i H'$ for all $i \in U$, $G$ is on $U$, and none of $(G)_i$ is a
receive $r(i, j, m)$ for any $j$ not in $U$. Then

$H' \in P,\ H; G \in P \Rightarrow H'; G \in P$.

The  extension  rule  guarantees  that  if  we  have  a  protocol  P,  some  history  H  in  P 
and  some  action  of  a  group  of  processes  U  is  possible  in  H,  then  the  same  action 
must  be  possible  in  every  history  H'  which  looks  the  same  to  all  processes  in  U  unless 
it violates time-consistency. In order to see why $e_i$ cannot be allowed to be a receive
from  a  processor  outside  U,  let  us  look  at  an  example: 

Let  N  =  {1,2,3},  U  =  {1,2}. 

$H = V; (\text{null}, \text{null}, s(3, 1, m))$, $H' = V; (\text{null}, \text{null}, \text{null})$ for some input $V$. Clearly $H \approx_1 H'$
and $H \approx_2 H'$. If we take $G = (r(1, 3, m), \text{null}, \text{null})$ such that $H; G \in P$ then requiring
$H'; G$ to be in $P$ would violate time-consistency.
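The example can be checked mechanically. The following Python sketch is an illustration under stated assumptions, not part of the formal model: events are encoded as tagged tuples `("s", sender, receiver, m)` and `("r", receiver, sender, m)`, null is `None`, and a send is taken to count only if it occurs at a strictly earlier instant than its receive.

```python
def time_consistent(events):
    """Check that each receive is matched by one send from an earlier instant."""
    pending = []  # sends not yet consumed, stored as (sender, receiver, message)
    for g in events:
        # Receives at this instant may only use sends from earlier instants.
        for e in g:
            if e is not None and e[0] == "r":
                _, receiver, sender, msg = e
                key = (sender, receiver, msg)
                if key not in pending:
                    return False
                pending.remove(key)   # each send matches at most one receive
        # Sends at this instant become available from the next instant on.
        for e in g:
            if e is not None and e[0] == "s":
                _, sender, receiver, msg = e
                pending.append((sender, receiver, msg))
    return True


G = (("r", 1, 3, "m"), None, None)
H_events = [(None, None, ("s", 3, 1, "m"))]      # events of H
H_prime_events = [(None, None, None)]            # events of H'
h_then_g = time_consistent(H_events + [G])
h_prime_then_g = time_consistent(H_prime_events + [G])
```

Appending $G$ to $H$ is fine, while appending it to $H'$ yields a receive with no matching send, which is why the extension rule excludes receives from outside $U$.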

The  following  conditions  assure  that  no  process  can  get  any  additional  information 
about  the  other  processes  by  observing  its  own  local  events  (no  hidden  synchronization). 
These  conditions  are  necessary  because  (unlike  Chandy  &  Misra  1986)  we  allow  local 
events  at  different  sites  at  the  same  instant  of  time.  Condition  (ii)  says  that  if  some 
local  events  have  occurred  in  parallel,  and  the  sets  of  participating  processes  were 
disjoint,  they  could  have  occurred  in  sequence.  We’ll  call  it  the  splitting  rule. 

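(ii) Splitting rule: $G = (e_1, \ldots, e_n)$, $G \notin \mathcal{V}$, $G$ is on $U$. Given $U_1, U_2$ such that
$U_1 \cup U_2 = U$ and $U_1, U_2$ disjoint, then we can “split” $G$ into $G_1$ and $G_2$:

$H; G \in P \Rightarrow H; G_1; G_2 \in P$,

where $(G)_i = (G_1)_i$ for $i \in U_1$, $(G)_i = (G_2)_i$ for $i \in U_2$, $(G_1)_j = \text{null} = (G_2)_k$ for $j \notin U_1$, $k \notin U_2$,
provided that we don't split any broadcasts: $(G)_i = bc(i, V, m) \to V \subseteq U_1$ or $V \subseteq U_2$.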
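A hedged Python sketch of the splitting operation follows. The encoding is an assumption for illustration only: a global event is a tuple with one slot per process (numbered from 1), null is `None`, and a broadcast is the tagged tuple `("bc", sender, group, m)` with `group` a frozenset.

```python
def split(g, u1, u2):
    """Split a global event g, which is on u1 | u2, into g1 on u1 and g2 on u2."""
    on = {i for i, e in enumerate(g, start=1) if e is not None}
    if u1 & u2 or (u1 | u2) != on:
        raise ValueError("u1 and u2 must be disjoint and cover the set g is on")
    for e in g:
        if e is not None and e[0] == "bc":
            group = set(e[2])
            if not (group <= u1 or group <= u2):
                raise ValueError("splitting would separate a broadcast")
    g1 = tuple(e if i in u1 else None for i, e in enumerate(g, start=1))
    g2 = tuple(e if i in u2 else None for i, e in enumerate(g, start=1))
    return g1, g2


g = (("s", 1, 2, "m"), None, ("s", 3, 1, "n"))
g1, g2 = split(g, {1}, {3})

b = ("bc", 1, frozenset({1, 2}), "m")
try:
    split((b, b, None), {1}, {2})
    broadcast_split_refused = False
except ValueError:
    broadcast_split_refused = True
```

The function only builds $G_1$ and $G_2$; whether $H; G_1; G_2$ belongs to a given protocol is what the rule itself asserts.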

Condition  (iii)  says  that  if  some  local  events  have  occurred  in  sequence,  the  sets  of 
participating  processes  were  disjoint,  and  there  was  no  send  receive  pair  in  them,  they 
could  have  occurred  in  parallel. 

(iii) Joining rule: Given $U_1, U_2$ such that $U_1 \cup U_2 = U$ and $U_1, U_2$ disjoint, let $G_1$
be on $U_1$, $G_2$ on $U_2$, and suppose there are no $i, j$ such that $(G_1)_i = s(i, j, m)$ and $(G_2)_j = r(j, i, m)$. Then

$H; G_1; G_2 \in P \Rightarrow H; G \in P$,

where $(G)_i = (G_1)_i$ for $i \in U_1$, $(G)_i = (G_2)_i$ for $i \in U_2$.

Systems. We consider three kinds of systems. Asynchronous systems are the systems
as  described  above  but  without  broadcasts.  The  only  communications  are  via  send 
and  receive.  Synchronous  systems  are  systems  in  which  all  communications  are  done 
using  broadcasts  and  we  do  not  have  the  events  send  and  receive.  Finally,  we  use  the 
name  mixed  communications  systems  for  systems  with  both  kinds  of  communications 
available. 


4.  Realization  of  levels  of  knowledge  in  distributed  systems 

Since  we  know  now  that  every  level  of  knowledge  is  a  downward  closed  set  of  strings, 
we  can  ask  whether  given  a  downward  closed  set  of  strings  L  we  can  find  some  formula 
A  and  some  run  of  a  distributed  system  (history  H)  such  that  L  =  L(A,  H).  The  answer, 
it  turns  out,  depends  on  the  kind  of  communication  available  in  the  system.  In  a 
system  with  unreliable  delivery  time  (asynchronous  system)  we  are  able  to  realize 
only  finite  levels  of  knowledge  (this  generalizes  the  result  of  Halpern  &  Moses  (1990) 
that  no  common  knowledge  can  be  achieved  in  an  asynchronous  system).  In  systems 
where  all  communications  are  instantaneous  broadcasts  with  at  least  3  processors  - 
synchronous  systems  -  all  levels  of  knowledge  can  be  realized.  If  there  are  only  2 
processors,  and  broadcasts  are  the  only  medium  of  communication,  then  the  finite 
levels containing strings longer than 1 cannot be realized. In the full systems, where
there  are  two  communication  media:  synchronous  broadcast,  and  asynchronous  send 
and  receive,  all  levels  can  be  realized. 

So  in  the  following  sections  we  separately  analyse  these  three  kinds  of  systems: 

(1)  Systems  where  only  asynchronous  communications  are  available. 

(2)  Systems  where  only  synchronous  communications  are  available. 

(3)  Systems  where  both  synchronous  and  asynchronous  communications  are  available. 

Before  we  proceed  with  realizing  levels  of  knowledge,  we  say  something  about  the 
properties  of  the  formulae,  which  we  will  look  at. 

DEFINITION  8 

A formula $A$ is persistent if whenever $H \models A$ and $H'$ extends $H$, then $H' \models A$.

Theorem 10. If $A$ is persistent then so is $K_i(A)$ for any $i$.

Proof. Suppose $H \models K_i(A)$ and $H'$ extends $H$. Suppose that for some $H''$, $\Phi_i(H'')$
equals $\Phi_i(H')$. Then for some initial segment $H_1$ of $H''$, $\Phi_i(H_1)$ equals $\Phi_i(H)$. Hence
$H_1$ satisfies $A$ and by the persistence of $A$, so does $H''$. Since $H''$ was arbitrary with
$\Phi_i(H'') = \Phi_i(H')$, $H'$ satisfies $K_i(A)$. □

Theorem 11. Every formula $A$ which is a boolean combination of $P_i$'s is persistent. □

COROLLARY  1 


Every formula of the form $xA$, where $A$ is a boolean combination of $P_i$'s, and $x$ is a
string of knowledge operators, is persistent. □

From now on we will look only at persistent formulae of the form given in
corollary 1 after theorem 11.

4.1  Asynchronous  case 

We  now  look  at  the  question  of  the  levels  of  knowledge  in  asynchronous  systems. 
The  only  possible  communications  are  via  send  and  receive,  where  the  arrival  time 
of  the  message  is  not  guaranteed.  In  the  formal  model  we  exclude  broadcasts,  so  no 
event is local to more than one processor.

What  are  the  consequences  of  the  fact  that  messages,  although  eventually  delivered, 
may  remain  in  the  mail  for  an  unbounded  amount  of  time?  Suppose  that  a  process 
i  sends  a  message  to  a  process  j  that  a  certain  formula  A  (whose  truth  value  is  invariant 
over  time)  is  true.  Then  when  process  j  receives  the  message,  it  knows  A  and  that  i 
knows $A$. Thus $K_j(K_i(A))$ is true. However, $i$ does not know that $j$ has received the
message and hence $K_i(K_j(K_i(A)))$ does not hold until $i$ receives an acknowledgement
from $j$.

Let's show that this fact is valid in our model. We'll take only a 2 processor system,
the processors are 1 and 2. Let $I$ be an initial configuration of the system in which
the first bit of an input of the first processor is 1 (and this single bit is the whole input
of 1). So $I = (v_1, v_2)$, $v_1 = 1$. Let $P_1$ say that the input value of the processor is 1, so
$P_1 = Q_{1,1}$. Slightly abusing notation we will write that $I \models P_1$, instead of $I \models P_1 = Q_{1,1}$.
$P_1$ is private to 1. Now let $H$ be a history which starts in the initial configuration $I$ and
in which 1 sends a message to 2 informing him that $P_1$ and this message is received
by 2 in the next instant of time. $H = I; (s(1, 2, P_1), \text{null}); (\text{null}, r(2, 1, P_1))$, and
$H \models K_2 K_1 P_1$.

Let $H'$ be a history in which the same message is sent, but it is delayed: it is not
received by 2 in the second instant of time: $H' = I; (s(1, 2, P_1), \text{null}); (\text{null}, \text{null})$.
Because of the extension rule $H'$ is in the protocol.

Clearly $H \approx_1 H'$.

Let $H''$ be a history with the same input configuration, but in which 1 is slow and
hasn't sent anything yet: $H'' = I; (\text{null}, \text{null}); (\text{null}, \text{null})$ (again, because of the
extension rule, $H''$ is in the protocol).

$H'' \approx_2 H'$.

Let $H''' = I_1$, where $I_1$ is an initial configuration in which 2 receives the same input
as in $H$, but 1's input is 0 instead of 1. $H'''$ is in the protocol because of the first
closure condition (all inputs are possible),

$H''' \approx_2 H''$, and we get:

$H''' \approx_2 H'' \approx_2 H' \approx_1 H$, so $H''' \approx_2 H' \approx_1 H$. Since $H''' \not\models P_1$ then $H \not\models K_1 K_2 P_1$, hence
$H \not\models K_1 K_2 K_1 P_1$.
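The chain of indistinguishable histories can be replayed in a few lines of Python. This is only a sketch under assumed encodings (null as `None`, events as tagged tuples, process 2's input arbitrarily fixed to `"0"`); it checks the $\approx_i$ relations by comparing local projections and does not implement the knowledge semantics itself.

```python
def project(inputs, events, i):
    """Local history of process i: input value, then its non-null events."""
    return (inputs[i - 1],) + tuple(g[i - 1] for g in events if g[i - 1] is not None)


def indistinguishable(a, b, i):
    """a ~_i b iff both global histories have the same projection on i."""
    return project(a[0], a[1], i) == project(b[0], b[1], i)


I = ("1", "0")    # initial configuration: input of 1 is 1 (input of 2 is assumed)
I1 = ("0", "0")   # same input for 2, but the input of 1 is 0
send, recv = ("s", 1, 2, "P1"), ("r", 2, 1, "P1")

H = (I, [(send, None), (None, recv)])      # message sent and received
H1 = (I, [(send, None), (None, None)])     # H': message sent but delayed
H2 = (I, [(None, None), (None, None)])     # H'': nothing sent yet
H3 = (I1, [])                              # H''': input of 1 is 0

chain = [
    indistinguishable(H, H1, 1),
    indistinguishable(H1, H2, 2),
    indistinguishable(H2, H3, 2),
]
p1_true_in_H3 = H3[0][0] == "1"
```

All three links of the chain hold while $P_1$ fails at its far end, which is the witness used above for $H \not\models K_1 K_2 P_1$.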

It  seems  reasonable  to  suppose  that  a  back  and  forth  interchange  of  messages  will 
only  make  a  finite  amount  of  difference  and  we  proceed  to  show  now  that  this  is 
indeed  the  case. 

Theorem 12. Let $A$ be a formula of the form $xB$, where $x$ is a string of knowledge
operators and $B$ is a boolean combination of private facts $P_i$. Let $H \not\models K_j A$. Then if
$H'' = H; H'$ and $H'' \models K_j A$, then there is a receive in $\Phi_j(H')$ (either of the form $r(j, l, m)$
for some $l$, or $bc(l, U, m)$ where $j \in U$, $l \neq j$). Informally: a process may learn something
about the others only when it receives a message.

Proof. Proof by induction on the length of $H'$. If the length of $H'$ is 0 then the
theorem clearly holds. So suppose that $H'$ is $H'_0; G$. If $H; H'_0 \not\models K_j A$ then there is some
$H_1 \approx_j H; H'_0$ such that $H_1 \not\models A$. By the splitting rule, if $G$ is not a broadcast, we can
split it: there are $G_2, G_1$ such that $H; H'_0; G_2; G_1$ is in $P$ where $(G_1)_j = (G)_j$, $(G_1)_i = \text{null}$
for $i \neq j$; $(G_2)_i = (G)_i$ for $i \neq j$, $(G_2)_j = \text{null}$. $H; H'_0; G_2 \approx_j H; H'_0$, so $H_1 \approx_j H; H'_0; G_2$. By
the extension rule $H_1; G_1$ must be in $P$ ($G_1$ was not a receive!). Furthermore
$H_1; G_1 \approx_j H; H'$. If $x$ was empty ($A$ is a boolean combination of private facts), then if
$H_1 \not\models A$, then $H_1; G_1 \not\models A$. If $x$ is $K_s y$, then by the induction hypothesis, since $s$ has
not received any message in $G_1$ he didn't learn anything, so if $H_1 \not\models A$, then $H_1; G_1 \not\models A$.
So in any case $H; H' = H'' \not\models K_j A$. □

As  a  consequence  we  get  the  following  theorem,  which  is  essentially  Chandy  & 
Misra’s  (1986)  theorem  5. 

Theorem 13. (Chandy & Misra 1986) If for some histories $H$, $H'$ such that $H$ is an
initial segment of $H'$:

$H' \models K_{i(1)} K_{i(2)} \cdots K_{i(p)} A$ and $H \not\models K_{i(p)} A$

then in $H' - H$ there must be a sequence of messages $m_{p-1}, m_{p-2}, \ldots, m_1$ such that
$m_{p-1}$ is sent by $i(p)$ and reaches $i(p-1)$ (maybe via some other processes), ..., $m_1$ is
sent by $i(2)$ and (maybe indirectly) reaches $i(1)$ (the messages may be different but they
must all imply $A$). Moreover if $A$ doesn't depend on any local event of $i(p)$ (its truth
value depends on some event $e \notin E_{i(p)}$) then there must be some event of the form $r(i(p), l, m)$
occurring after $H$ but before $s(i(p), i(p-1), m_{p-1})$.

Proof.  By  induction  on  p  using  the  previous  theorem. 

Theorem 14. Suppose communication is asynchronous and $A$ is a persistent formula.
Let $H$ and $H'$ be global histories such that $H'$ extends $H$. Then $L(A, H')$ includes $L(A, H)$
and there is a finite $X$ such that $L(A, H') \subseteq dc(X \cup L(A, H))$.

Proof.  The  fact  that  the  level  increases  with  time  follows  from  the  fact  that  we  deal 
with  persistent  formulae. 

For  the  second  part  we  use  the  theorem  of  Chandy  &  Misra  (1986). 

Since  there  are  only  finitely  many  events  G  between  H  and  H',  there  are  only  finitely 
many  possible  sequences  of  events  as  above  and  a  finite  set  of  strings  x  for  which  the 
conditions  of  theorem  13  are  satisfied.  Let  X  be  the  set  of  these  strings,  then  X 
satisfies  the  conditions  of  the  theorem.  □ 

Now  we  characterize  precisely  how  the  level  of  knowledge  grows  when  a  message 
is  received. 

Theorem 15. Suppose that a history $H$ is exactly the following sequence of messages:

$s(i_m, i_{m-1}, R_m);\ r(i_{m-1}, i_m, R_m);$

$s(i_{m-1}, i_{m-2}, K_{i(m)} R_m);\ r(i_{m-2}, i_{m-1}, K_{i(m)} R_m);\ \ldots$

$s(i_2, i_1, K_{i(3)} \cdots K_{i(m)} R_m);\ r(i_1, i_2, K_{i(3)} \cdots K_{i(m)} R_m)$

(for simplicity we have left out the null events of processors), where $R_m$ is initially (in
the empty history) known only to $i_m$; then

$L(R_m, H) = dc\{K_{i(1)} K_{i(2)} \cdots K_{i(m)}\}$.

Proof. It is easy to notice that $dc\{K_{i(1)} K_{i(2)} \cdots K_{i(m)}\} \subseteq L(R_m, H)$. The other inclusion
we prove by induction on the length of $H$. If it is 0 then $H$ is empty so $R_m$ is known
only to $i_m$. So assume that the theorem is true for histories of length up to $m - 1$.
Let us assume then that there is some $y$ such that $H \models y R_m$ and

$y \notin dc\{K_{i(1)} K_{i(2)} \cdots K_{i(m)}\}$.

So $y$ must be non-empty. Let then $y = K_l y'$. If $l$ was not mentioned in the sequence
of messages ($l$ was not any of $i_1, \ldots, i_m$) then $l$ would have known $R_m$ initially (it
hasn't received any messages, see theorem 12), which contradicts our assumption.
So it must be that $l \in \{i_1, \ldots, i_m\}$. Then we can express $K_{i(1)} K_{i(2)} \cdots K_{i(m)}$ as $z K_l z'$ where
the $K_l$ picked is the leftmost occurrence of $K_l$ in $K_{i(1)} K_{i(2)} \cdots K_{i(m)}$. Let's define $H'$ as
the initial segment of $H$ up to the last receive by $l$. $l$ doesn't learn anything in $H - H'$
(he doesn't receive any messages), so if $H \models K_l y' R_m$ then $H' \models K_l y' R_m$. $H' = H''; G$ where
$(G)_l = r(l, s, m)$. By the splitting rule we can split $G$ into $G_1$ and $G_2$, where $G_1$ is a
receive on $l$, $G_2$ is null on $l$. $H''; G_1; G_2$ must be in the protocol. $H' \approx_l H''; G_1$, so
$H'' \models y' R_m$. Then by the induction hypothesis $y' \in dc(\{z'\})$, so $y' \leq z'$ and therefore
$y \leq K_{i(1)} K_{i(2)} \cdots K_{i(m)}$. □
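To make the level in theorem 15 concrete, the following Python sketch enumerates the downward closure of a single string. It rests on two assumptions that are not spelled out in this section: that the order $\leq$ on strings is subsequence embedding, and that adjacent repetitions collapse (so only simple strings are kept). A string $K_1 K_2 K_3$ is encoded as the tuple `(1, 2, 3)`.

```python
from itertools import combinations


def collapse(s):
    """Remove adjacent repetitions, so that K_i K_i is identified with K_i."""
    out = []
    for a in s:
        if not out or out[-1] != a:
            out.append(a)
    return tuple(out)


def dc(x):
    """Downward closure of one string: all (collapsed) subsequences of x."""
    return {
        collapse(tuple(x[i] for i in idx))
        for r in range(len(x) + 1)
        for idx in combinations(range(len(x)), r)
    }


level = dc((1, 2, 3))   # level of R_3 after the chain 3 -> 2 -> 1
```

For a chain of three processes the level contains, for instance, $K_1 K_3$ but not $K_3 K_1$: process 3 never learns what process 1 knows.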

Theorem 16. (a) Suppose $H$ realizes $L(A, H)$ (for some formula $A$ which is a boolean
combination of $P_i$'s) and $H \leq H'$ (see footnote 6), i.e. all messages in $H$ occur in $H'$ and in the same
order; then $H'$ realizes at least $L(A, H)$.

(b) Suppose $H$ realizes $L(A, H)$ and $H'$ realizes $L(A, H')$. Suppose that $H''$ is the
concatenation of $H$ and $H'$. Then:

$L(A, H'') = L(A, H) \cup L(A, H')$

Proof. Part (a) follows easily by induction on the length of $H$ using the fact that $A$
is persistent.

To see part (b), clearly $L(A, H) \cup L(A, H') \subseteq L(A, H'')$.

Suppose $H'' \models yA$. If $y$ is empty, then we are done, since $H \models A$.

Otherwise let $yA$ be true in $H''$, where $y$ is (say) $K_1 y'$. Suppose that $K_1 y' \notin L(A, H)$.
Then there must be some point $T$ such that $H; H_T \models K_1 y' A$, $H; H_{T'} \not\models K_1 y' A$ (for all
$H_{T'}$ such that $H_{T'} < H_T \leq H'$). By theorem 12 $H_T$ has as its last event some $G$ such
that $(G)_1 = r(1, l, zA)$. So $y' \leq K_l z$. But the event $G$ was a part of $H'$. A message sent
and received in $H'$ must have been true in $H'$. So $H_T \models K_1 y' A$, so $H' \models yA$. □

Note  that  H  and  H'  could  be  executed  in  parallel.  In  fact  we  can  take  any  minimal 
H"  such  that  both  H  and  H '  are  embeddable  in  H". 

We  now  show  that  all  finite  downward  closed  sets  are  actually  attainable  as 
knowledge levels $L(A)$ of formulas in asynchronous systems.

Theorem  17.  Every  finite  downward  closed  set  is  the  set  L(A,H)  for  an  appropriate 
A  and  H  in  some  asynchronous  protocol. 


6 $\leq$ is here the embeddability relation defined for histories.

Construction. Let $X$ be a finite downward closed set of strings from $\Sigma^s$. We construct
a formula $A$ and a history $H$ such that $L(A, H) = X$.

(1)  If  X  is  empty  then  A  is  the  formula  false. 

(2) If $X$ consists of the empty string, then $A$ is the conjunction $P_1$ and $P_2$ where:
$P_1$ is a predicate whose truth value is initially known only to process 1, $P_2$ is a
predicate whose truth value is initially known only to process 2, and $H$ is the empty
history.

(3) Otherwise $X$ has nonempty strings and therefore the set of all maximal strings
$M$ in $X$ is nonempty. Let $M = \{x_1, x_2, \ldots, x_r\}$. Let $x_i = K_{i(k_i)} \cdots K_{i(2)} K_{i(1)}$ for $i = 1, \ldots, r$.
Then we take $A$ to be $\bigvee_{i=1}^{r} P_{i(1)}$.

Let

$H_i = s(i(1), i(2), A);\ r(i(2), i(1), A);\ s(i(2), i(3), K_{i(1)} A);\ \ldots;$
$s(i(k-1), i(k), K_{i(k-2)} \cdots K_{i(2)} K_{i(1)} A);\ r(i(k), i(k-1), K_{i(k-2)} \cdots K_{i(2)} K_{i(1)} A)$.

Clearly $L(A, H_i) = dc(x_i)$. Now by theorem 16 if $H = H_1; H_2; \ldots; H_r$ then $L(A, H) =
\bigcup_{i=1}^{r} dc(x_i)$.

Note that $H$ could be any permutation of the $H_i$'s. In fact all $H_i$ could be executed in
parallel. □
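The construction can be sketched in Python as follows. Everything about the encoding is assumed for illustration: strings are tuples of process numbers read left to right (so `(3, 2, 1)` stands for $K_3 K_2 K_1$), messages are plain text labels, and the order on strings is taken to be subsequence embedding. The sketch only generates the message sequence; it does not verify the resulting level.

```python
def embeds(y, x):
    """True iff y is a subsequence of x."""
    it = iter(x)
    return all(a in it for a in y)


def maximal(strings):
    """Strings of a finite set that embed in no other string of the set."""
    return sorted(
        x for x in strings
        if not any(x != y and embeds(x, y) for y in strings)
    )


def chain_history(x, fact="A"):
    """Send/receive sequence H_i for x = (i(k), ..., i(2), i(1))."""
    procs = list(reversed(x))             # i(1), i(2), ..., i(k)
    events, msg = [], fact
    for sender, receiver in zip(procs, procs[1:]):
        events.append(("s", sender, receiver, msg))
        events.append(("r", receiver, sender, msg))
        msg = f"K{sender} {msg}"          # what the receiver can pass on next
    return events


X = {(), (1,), (2,), (3,), (2, 1), (3, 1)}        # a finite downward closed set
M = maximal(X)
H = [e for x in M for e in chain_history(x)]      # H_1; H_2 in one fixed order
long_chain = chain_history((3, 2, 1))
```

Since the order of the $H_i$ does not matter, `sorted` is used here only to make the output deterministic.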

4.2  Synchronous  case 

In this section we will augment the alphabet $\Sigma$ to a larger alphabet $\Sigma_C$ which includes
symbols $C_U$ where $U \subseteq N$. Semantically, the symbol $C_U$ denotes the set $\{K_i \mid i \in U\}^s$
and will be referred to as the common knowledge of processors in $U$. However, the
element $C_U$ of $\Sigma_C$ is not the infinite set of all strings in the denotation of $C_U$, but the
string $C_U$ where $U$ is explicitly enumerated. The $C_U$ as part of a message is a finite
string which denotes an infinite regular set of strings.

Levels of knowledge will continue to be identified with subsets of $\Sigma^s$. Thus, for
example, the string $C_{\{1,2\}} C_{\{2,3\}}$ denotes, as a subset of $\Sigma^s$, the set of all simple strings
consisting of any number of $K_1$ and $K_2$ followed by any number of $K_2$ and $K_3$.
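Because each $C_U$ denotes a regular set, membership in the denotation of a string of common knowledge symbols can be tested with an ordinary regular expression. The Python sketch below assumes single-digit process numbers, writes $K_i$ as the digit `i`, and takes "simple" to mean "no two adjacent symbols are equal"; these are illustrative choices, not definitions from the text.

```python
import re


def denotation_regex(groups):
    """Regex for C_{U_1} C_{U_2} ...: a block of K_i with i in U_1, then U_2, ..."""
    blocks = ("[" + "".join(str(i) for i in sorted(g)) + "]*" for g in groups)
    return re.compile("".join(blocks))


def in_denotation(string, groups):
    """True iff `string` is simple and lies in the denotation of the C-string."""
    is_simple = all(a != b for a, b in zip(string, string[1:]))
    return is_simple and denotation_regex(groups).fullmatch(string) is not None


C = [{1, 2}, {2, 3}]                        # the string C_{1,2} C_{2,3}
accepted = in_denotation("1212323", C)      # K1 K2 K1 K2 K3 K2 K3
rejected = in_denotation("131", C)          # K1 after K3 is not allowed
```

The empty string is always accepted, in line with levels being downward closed.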

We  start  with  a  preliminary  result. 

Theorem  18.  The  set  L(A)  is  infinite  iff  it  includes  common  knowledge  of  A  between 
two  distinct  processes  i  and  j. 

Proof. One direction is clear, as common knowledge between $i$ and $j$ includes all
strings in $\{K_i, K_j\}^s$.

Conversely, suppose that $L(A)$ does not include common knowledge of any two
distinct $i, j$. Then, since $L(A)$ is downward closed, for any such $i, j$, there must be a
maximum number $Alt(i, j)$ of alternations between $K_i$ and $K_j$ in any string in $L(A)$.
Let $m_A$ be the largest of these $Alt(i, j)$ and let $p$ be $m_A \cdot n^2$ ($n$ is the number
of processors). Now any nonrepeating string of length greater than $p + 1$ has at least
$p + 1$ alternations, and hence more than $m_A$ for some specific alternation between
some $K_i$ and $K_j$. Thus no string in $L(A)$ can have length greater than $p + 1$ and $L(A)$
is finite. □

In the following theorem we recall that $C_U$ stands semantically for the set $\{K_i \mid i \in U\}^s$.

Theorem 19. If we broadcast “$xA$”, where $x$ is a string of knowledge operators and $A$
is a boolean combination of propositions $P_i$, among the group of processes $U$, then the
created level of knowledge of $A$ increases by the downward closure of $\{C_U\}$. Formally:
If $H = H'; G$, $\forall j \in U\ (G)_j = bc(i, U, xA)$, $\forall j' \notin U\ (G)_{j'} = \text{null}$, then:

$L(A, H) = L(A, H') \cup dc(\{C_U\})\, dc(\{x\})$.

Proof. Let $H$ be as in the premises of this theorem. Then $H' \models A$ (our processes are
honest and $A$ may be broadcast). Moreover, since it is a property of all histories
in the protocol, it is common knowledge that formulae sent (or broadcast) are true
and remain true (all formulae persistent):

$\models C_N((H = H'; G; H''$ and $(G)_j = bc(i, U, xA)) \to H \models xA)$,

this implies that for every $U \subseteq N$:

$\models C_U((H = H'; G; H''$ and $(G)_j = bc(i, U, xA)) \to H \models xA)$,

so in order to prove that $H \models C_U xA$ it is enough (since common knowledge is closed
under modus ponens) to show that $H \models C_U(H = H'; G; H''$ and $(G)_j = bc(i, U, xA))$. But
it is a property of $P$ that if for some $j \in U$, $(G)_j = bc(i, U, B)$ then for all $j \in U$, $(G)_j =
bc(i, U, B)$. So $\forall H_0 \approx_U H$, $(H = H'; G; H''$ and $(G)_j = bc(i, U, xA)) \to (H_0 = H'_0; G; H''_0$ and
$(G)_j = bc(i, U, xA))$. Therefore $H \models C_U xA$, so $H \models dc(\{C_U\})\, dc(\{x\}) A$, $H \models dc(\{C_U x\}) A$.
The theorem follows since no process outside of $U$ received any message. □

We  now  prove  that  in  the  presence  of  synchronous  communication,  if  there  are  at 
least  three  processors,  then  every  downward  closed  set  of  strings  without  repetitions 
is  the  level  of  knowledge  of  some  formula  under  some  history.  We  have  proved  that 
if $m = (x_1, \ldots, x_k)$ is the set of minimal elements of the complement of some downward
closed set $L$, then

$L = \bigcap_{i \leq k} N(x_i)$.

If $x = a_1 \cdots a_m$ then $N(x) = (\Sigma - a_1)^s \cdots (\Sigma - a_m)^s$.
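The product form of $N(x)$ can be sanity-checked by brute force. The Python sketch below assumes that $N(x)$ is the set of simple strings into which $x$ does not embed as a subsequence (which is what the intersection formula above suggests), uses the three-letter alphabet `"123"` with $K_i$ written as the digit `i`, and compares that set with the block product $(\Sigma - a_1)^s \cdots (\Sigma - a_m)^s$ expressed as a regular expression.

```python
import re
from itertools import product


def is_simple(w):
    """No two adjacent symbols are equal."""
    return all(a != b for a, b in zip(w, w[1:]))


def embeds(x, w):
    """True iff x is a subsequence of w."""
    it = iter(w)
    return all(a in it for a in x)


def in_block_product(w, x):
    """w = w_1 ... w_m where block w_j avoids the letter a_j of x."""
    pattern = "".join(f"[^{a}]*" for a in x)
    return re.fullmatch(pattern, w) is not None


alphabet = "123"
mismatches = [
    (x, w)
    for x in ["1", "12", "121", "123", "3121"]        # simple strings only
    for n in range(7)
    for w in map("".join, product(alphabet, repeat=n))
    if is_simple(w) and (not embeds(x, w)) != in_block_product(w, x)
]
```

The check covers only strings of length at most 6 over three symbols, so it illustrates the identity rather than proving it.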

The next theorem will show us how to realize $N(x)$. We take a sequence of broadcasts
to $N - 1$ processes at a time. Let's assume that at time $T$, group $U_T$ receives a broadcast
and at time $T + 1$ group $U_{T+1}$ (both of $N - 1$ processors). Then the processor sending
the broadcast at $T + 1$ ($a_{s_{T+1}}$) will be one of the processors in $U_{T+1} \cap U_T$, and the message
sent will be the string of common knowledge operators $C_{U_T} \cdots C_{U_1}$ followed by a fixed
formula $P$ private to one of the processes in $(\Sigma - a_m)$.

Theorem 20. Let $H$ be a history in which the private information of $s_1$ is $P_{s_1}$, and every
event $G_T$ occurring in $H$ at time $T$ for $T = 1, \ldots, l$ (where $l$ = length$(H)$) is of the form:
$(G_T)_i = bc(a_{s_T}, N - \{a_T\}, m_T)$ for all $i \neq a_T$, with $a_{s_T} \neq a_T$, $a_{s_T} \neq a_{T-1}$, and $(G_T)_i = \text{null}$ otherwise, where
$m_1 = P_{s_1}$, $m_{T+1} = C_{N - \{a_T\}} m_T$. Then:

$L(P_{s_1}, H) = dc(C_{N - \{a_l\}} \cdots C_{N - \{a_1\}})$.

Proof. Induction on the length of $H$. Clearly if $l = 1$ then $L(P_{s_1}, H) = dc(C_{N - \{a_1\}})$.

Suppose that for histories up to the length $l - 1$ the theorem is true. Now we can use




Since by the induction hypothesis (IH) $L(P_{s_1}, H') = dc(C_{N - \{a_{l-1}\}} \cdots C_{N - \{a_1\}})$ then


□ 


COROLLARY  1 

In  a  system  with  at  least  3  processors,  for  every  x  in  Es,  there  exists  a  history  H  and 
a  formula  A  such  that  L(A,H)  is  just  the  set  of  strings  without  repetitions  in  N(x). 

Theorem  21.  Every  downward  closed  set  L  of  strings  without  repetitions  is  L(A,H ) 
for  suitable  A  and  H  in  a  synchronous  system  with  at  least  3  processors. 

Proof. Let $\{x_1, \ldots, x_k\}$ be the set of minimal elements of the set of strings without
repetitions which are not in $L$. Then $L$ is $N(x_1, \ldots, x_k)$.

If $k = 0$ then all strings without repetitions are in $L$ and $L$ is just common knowledge,
which can be achieved by taking the formula $A$ to be a tautology.

Otherwise for each $x_j$ we can find a $P_j$ and a history $H_j$ such that $L(P_j, H_j)$ is
exactly $N(x_j)$. Now let $H$ be $H_1; \ldots; H_k$ and $A$ be the conjunction of the $P_j$. (We
assume the $P_j$ are all independent so that if $i \neq j$ then $H_i$ conveys no information
about $P_j$.) Then $y \in L(A, H)$ iff $H \models yA$ iff for all $j$, $H \models yP_j$ iff for all $j$, $H_j \models yP_j$ iff for all
$j$, $y \in N(x_j)$ iff $y \in L$. □


Theorem  22.  In  a  two  processor  system  with  only  synchronous  communication  available, 
no finite level containing strings of length $\geq 2$ can be achieved for any formula $A$.

Proof. Suppose that $H \models K_1 K_2 A$. If in the empty history 1 knows that 2 knows $A$,
then  A  must  be  true  in  all  histories  (A  persistent)  and  therefore  must  be  common 
knowledge.  Otherwise  1  must  have  learned  that  K2A  in  H,  so  there  must  have  been 
a  communication  between  1  and  2  to  that  effect,  but  there  are  only  synchronous 
communications  in  the  system,  and  these  create  common  knowledge.  □ 

4.3  Mixed  case  -  both  kinds  of  communication  available 


If  we  have  both  kinds  of  communications  in  the  system,  we  can  directly  realize  every 
level.  We  show  first  how  to  realize  a  level  in  the  normal  form.  Later  we  will  also  show 
how  to  realize  the  level  given  by  the  set  of  minimal  strings  of  the  complement.  As 
we  will  see  from  the  examples  in  a  following  section,  if  we  have  some  specification 
of  the  level  of  knowledge  to  be  achieved,  which  is  incomplete  (doesn’t  specify  a  unique 
level),  then  the  two  constructions  will  give  us  different  levels  of  knowledge  (maximal 
and  minimal)  satisfying  requirements. 

Construction.

$L = \bigcup_{i=1}^{k} dc(X_i)$,

where $X_i$ is star-linear.

We take $A = \bigvee_{i=1}^{k} P_{s(i)}$ where $s(i) \in U_j$, such that $x_{i(1)} = C_{U_j}$. We create $H_i$ to realize
the level of knowledge $dc(X_i)$ of a formula $P_{s(i)}$ for every $i$, and then we take $H$ to
be the concatenation of all $H_i$'s.

We  can  also  construct  a  level  specified  by  a  set  of  minimal  strings  of  the  complement. 

Construction. 

$$L = N(x_1, \ldots, x_k),$$

where $x_i \in \Sigma^*$.

We take $A = \bigwedge_{i=1}^{k} P_{s(i)}$ where $s(i) \in U_j$, such that $x_i = y C_{U_j}$. We create $H_i$ to realize
the level of knowledge $N(x_i)$ of a formula $P_{s(i)}$ for every $i$, then we take $H$ to be a
concatenation of all the $H_i$'s.

Let $x_i = C_{U_m} \cdots C_{U_1}$.

(i) $s(i)$ initially sends a broadcast $P_{s(i)}$ to all the processes not in $U_1$.

(ii) Sender of the last broadcast (to $N - U_j$) sends asynchronously all that he knows
about $P_{s(i)}$ to one of the processes not in $U_{j+1}$ (if there was no broadcast yet, $s(i)$ is
a sender).

(iii) Recipient of the last asynchronous message broadcasts all that he knows about
$P_{s(i)}$ to all processes in $N - U_{j+1}$. □

4.4  Limited  n-casts 

If  in  our  system  we  have  only  a  limited  broadcast  capability,  then  not  all  levels  of 
knowledge  can  be  achieved.  Precisely: 

Theorem 23. If in a system of $N$ processes, all the broadcasts are to groups of $n$
processes ($n < N$), then no $C_U A$ can be realized for any $U$ with $|U| > n$, for any formula
$A$ which is not true in all the histories.

Proof. Broadcast $bc(i, U, m)$ creates common knowledge of $m$ among $U$. A sequence
of broadcasts to groups of up to $n$ processes (say to $U_1, \ldots, U_k$) will create $dc(C_{U_k} \cdots C_{U_1})$,
where all $U_i$ for $i = 1, \ldots, k$ have at most $n$ elements. Let $U'$ be a set of $n'$ elements
where $n' > n$. Then $C_{U'} \notin dc(C_{U_k} \cdots C_{U_1})$, so it is not realized by a sequence of $k$
$n$-casts. □

Theorem 24. In the usual version of CSP with ‘2-casts’, $C_U(A)$ with $|U| > 2$ cannot
be achieved for any $A$ which was not common knowledge to begin with.


5.  Examples 

Every  level  of  knowledge  can  be  generated  in  an  appropriate  system.  Practically  there 
may  be  two  kinds  of  reasons  we  want  to  obtain  some  particular  level  of  knowledge. 

(1) We want to have enough knowledge in the system so we specify which strings $L$
we want to have included in a realized level $L(A, H)$.

(2)  We  want  to  prevent  certain  processes  from  knowing  some  facts  about  the  others. 
In such a situation we would specify a set $L'$ of the strings we don’t want to include
in $L(A, H)$. Let us consider now a pair $(L, L')$ as a knowledge specification for our
system, and consider how we might realize it.
1. If $L' \cap dc(L)$ is not empty then $(L, L')$ cannot be realized for any formula.

For example if $L = \{K_1 K_2 K_3\}$, $L' = \{K_1, K_3\}$ then there is no system which
realizes $(L, L')$.

So  a  necessary  condition  for  realizability  of  (L,  L')  is: 

$$\forall m \in L \;\; \forall m' \in L' \;\; \neg (m' \leq m).$$
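
The necessary condition above can be checked mechanically for small specifications. The following is a minimal sketch, assuming that strings contain only individual knowledge operators $K_i$ (no common knowledge operators $C_U$) and that $m' \leq m$ is read as "$m'$ embeds in $m$ as a subsequence". Both are simplifying assumptions made for this illustration, and the function names are hypothetical.

```python
def embeds(small, big):
    """True if `small` is a (not necessarily contiguous) subsequence of `big`."""
    remaining = iter(big)
    # `op in remaining` consumes the iterator up to the first match,
    # so the operators of `small` must occur in `big` in the same order.
    return all(op in remaining for op in small)


def conflicts(L, L_prime):
    """Pairs (m, m') with m' <= m; any such pair makes (L, L') unrealizable."""
    return [(m, mp) for m in L for mp in L_prime if embeds(mp, m)]


# The example from the text: L = {K1 K2 K3}, L' = {K1, K3}.
L = [("K1", "K2", "K3")]
L_prime = [("K1",), ("K3",)]
found = conflicts(L, L_prime)
print(found)
```

Both strings of $L'$ embed in $K_1 K_2 K_3$, so the check reports two conflicts, in line with the claim that no system realizes this pair.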

Now  we  have  two  possibilities: 

2. If $dc(\Sigma^* - L') = dc(L)$ then $(L, L')$ uniquely specifies the level, which can be
realized for an appropriate formula in an appropriate system.

Let us look at an example. We have some fact $F$, and we want to create a history
$H$ such that $H \models K_A K_B F$, $H \models K_B K_C F$, $H \models K_C K_A F$ (these three facts already imply
that $H \models K_A F$, $H \models K_B F$, $H \models K_C F$), but we also want $H \models \neg(K_A K_C F)$, $H \models \neg(K_B K_A F)$,
and $H \models \neg(K_C K_B F)$. So our specification is $(L, L')$, where $L = \{K_A K_B, K_B K_C, K_C K_A\}$,
$L' = \{K_A K_C, K_B K_A, K_C K_B\}$. Notice that in this case, since we exclude the possibility
of common knowledge between any pair of processes, our level of knowledge must be
finite. In fact these requirements characterize the level of knowledge completely. The
level we are looking for is exactly the set $L(A, H) = \{K_A K_B, K_B K_C, K_C K_A, K_A, K_B, K_C, \varepsilon\}$.
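
The seven-element level in this example can be reproduced by computing a downward closure. This is a sketch under the assumption that, for strings built only from individual operators $K_A$, $K_B$, $K_C$, the closure $dc(L)$ consists of all subsequences of the strings in $L$. The treatment of common knowledge operators is deliberately left out, and `downward_closure` is a hypothetical helper name.

```python
from itertools import combinations


def downward_closure(strings):
    """All subsequences of the given strings (strings are tuples of operators)."""
    closure = set()
    for s in strings:
        for size in range(len(s) + 1):
            # Choosing index sets in increasing order preserves operator order.
            for indices in combinations(range(len(s)), size):
                closure.add(tuple(s[i] for i in indices))
    return closure


L = [("KA", "KB"), ("KB", "KC"), ("KC", "KA")]
L_prime = [("KA", "KC"), ("KB", "KA"), ("KC", "KB")]

level = downward_closure(L)
for string in sorted(level, key=lambda s: (len(s), s)):
    print(" ".join(string) or "(empty string)")
```

The closure contains the three strings of $L$, the three single operators and the empty string, and none of the strings in $L'$.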

We can attain $(L, L')$ by a protocol in which (for example) all of $A$, $B$, $C$ independently
learn about $F$, $A$ sends a message $F$ to $C$, $C$ sends $F$ to $B$, $B$ sends $F$ to $A$, and all
messages are sent asynchronously. In a synchronous system we can achieve $(L, L')$
by a protocol in which first we use a broadcast to bring about common knowledge
of the fact $(p \wedge q \wedge r) \rightarrow F$, where $p$ is a private fact of $A$, $q$ is a private fact of $B$, and $r$ is a private
fact of $C$. $A$ broadcasts $p$ to $\{A, B\}$ and later $B$ broadcasts $C_{\{A,B\}} p$ to $\{B, C\}$, creating
$N(K_A K_C)$. Similarly first $B$ broadcasts $q$ to $\{B, C\}$, later $C$ broadcasts $C_{\{B,C\}} q$ to $\{A, C\}$,
creating $N(K_B K_A)$; finally $C$ broadcasts $r$ to $\{A, C\}$, and then $A$ broadcasts $C_{\{A,C\}} r$
to $\{A, B\}$, creating $N(K_C K_B)$.

3. If $(L, L')$ can be realized but there is $x \in \Sigma^*$ such that $x \notin dc(L)$ and $\neg(\exists m' \in L',\ m' \leq x)$
then the level is not uniquely specified (we can include $x$ in $(L, L')$ but we don’t have to).

The smallest level realizing $(L, L')$ is $dc(L)$ and the largest one is $N(L')$ (or
$dc(\Sigma^* - L')$).

Let for example $L = \{C_{\{1,2\}} K_3\}$, $L' = \{K_3 K_2\}$. In order to realize $(L, L')$ we can
take $dc(L)$. This can be done by taking formula $A$ to be $P_3$, and a history:

$$s(3, 1, K_3 P_3);\ r(1, 3, K_3 P_3);\ bc(1, \{1, 2\}, K_3 P_3).$$

Another possibility is to realize $N(K_3 K_2)$.


We thank Konstantinos Georgatos for a very careful reading of the manuscript.


References 


Aumann  R  1976  Agreeing  to  disagree.  Ann.  Stat.  4:  1236-1239 
Chandy  M,  Misra  J  1986  How  processes  learn.  Distrib.  Comput.  1(1):  40-52 
de  Jongh  D  H  J,  Parikh  R  1977  Well  partial  orderings  and  hierarchies.  Proc.  K.  Ned.  Akad. 
Wet.  A80:  195-207 

Halpern  J,  Moses  Y  1990  Knowledge  and  common  knowledge  in  a  distributed  environment. 
J.  Assoc.  Comput.  Mach.  37:  549-578 




Halpern  J,  Zuck  L  1987  A  little  knowledge  goes  a  long  way:  simple  knowledge-based  derivations 
and  correctness  proofs  for  a  family  of  protocols.  Proc.  6th  ACM  Symp.  on  Principles  of 
Distributed  Computing  (New  York:  ACM  Press)  pp.  269-280 
Higman  G  1952  Ordering  by  divisibility  in  abstract  algebras.  Proc.  Lon.  Math.  Soc.  2:  326-336 
Lamport  L  1978  Time,  clocks  and  the  ordering  of  events  in  a  distributed  system.  Commun. 
ACM  21:  558-565 

Lewis  D  1969  Convention,  a  philosophical  study  (Harvard:  University  Press) 

Moses  Y,  Tuttle  M  1988  Programming  simultaneous  actions  using  common  knowledge. 
Algorithmica  3:  121-169 

Parikh  R  1986  Levels  of  knowledge  in  distributed  computing.  IEEE  Symposium  on  Logic  in 
Computer  Science  (New  York:  IEEE  Press)  pp.  322-331 
Parikh R, Krasucki P 1986 Levels of knowledge in distributed computing. Brooklyn College
Dept. of CIS, Technical Report

Parikh  R,  Krasucki  P  1990  Communication,  consensus  and  knowledge.  J.  Econ.  Theory  52: 
178-189 

Parikh R, Ramanujam R 1985 Distributed processes and the logic of knowledge. Logics of
programs. Lecture Notes in Computer Science. Vol. 193 (Berlin: Springer-Verlag) pp. 256-268
Schiffer  S  1972  Meaning  (Oxford:  University  Press) 




Sadhana,  Vol.  17,  Part  1,  March  1992,  pp.  193-220.  ©  Printed  in  India. 


Scalable  concurrent  computing 


NALINI VENKATASUBRAMANIAN*, SHAKUNTALA MIRIYALA†
and  GUL  AGHA 

Department  of  Computer  Science,  University  of  Illinois  at  Urbana- 
Champaign,  Urbana,  IL  61801,  USA 

*Present  address:  Hewlett  Packard  Company,  19111  Pruneridge  Avenue 
MS44UT,  Cupertino,  CA  95014,  USA 

†Present address: Vista Technologies, 1100 Woodfield Rd Suite 108,
Schaumburg,  IL  60173,  USA 


Abstract.  This  paper  focusses  on  the  challenge  of  building  and 
programming  scalable  concurrent  computers.  The  paper  describes  the 
inadequacy  of  current  models  of  computing  for  programming  massively 
parallel  computers  and  discusses  three  universal  models  of  concurrent 
computing - developed respectively from the programming, architecture and
algorithm  perspectives.  These  models  provide  a  powerful  representation 
for  parallel  computing  and  are  shown  to  be  quite  close.  Issues  in  building 
systems  architectures  which  efficiently  represent  and  utilize  parallel 
hardware  resources  are  then  discussed.  Finally,  we  argue  that  by  using  a 
flexible  universal  programming  model,  an  environment  supporting 
heterogeneous  programming  languages  can  be  developed. 

Keywords.  Scalable  concurrent  computing;  massively  parallel  computers; 
systems  architectures;  heterogeneous  programming  languages. 


1.  Introduction 

Computers  have  penetrated  into  many  branches  of  learning,  industry  and  art.  The 
increasing  scope  of  application  domains  for  computers  has  brought  new  demands 
for  computer  technology.  At  the  same  time,  computers  and  programming 
methodologies  have  undergone  dramatic  changes  in  the  past  couple  of  decades,  both 
in  the  conceptual  view  of  a  program  as  well  as  the  physical  layout  of  the  machine. 
An  analysis  of  these  changes  suggests  that  computer  systems  may  be  divided  into  five 
generations  as  shown  in  table  1  (Hwang  &  Briggs  1984).  The  foundational  basis  of 
the  first  four  generations  of  computers  is  the  stored-program  concept  or  the  von 
Neumann  model.  This  paper  describes  fifth  generation  computer  systems  and  focusses 
on developments in software technology necessary to realize them.

The  outline  of  this  paper  is  as  follows.  Section  2  describes  the  von  Neumann  model. 
This  model  has  fundamental  limitations  for  scalability  which  must  be  overcome.  By 
scalability,  we  mean  the  ability  of  the  system  to  display  an  increase  in  performance 
with the addition of computational resources. We then review concurrent computer
architectures and their scalability characteristics. Section 3 reviews universal models
for concurrent computing. Specifically, we describe ‘actors’, a scalable model of
concurrent programming which overcomes the limitations of the von Neumann
model. A number of problems, such as composability of independent systems,
debugging and effective system management etc., arise in concurrent systems. Section 4
addresses problems in realizing concurrent computation, namely, new techniques for
managing the computational resources of large numbers of concurrent processors.
Section 5 discusses the use of actors to provide interoperability in programming
environments - i.e., to provide a software technology which permits use of multiple
programming paradigms. The final section discusses some ongoing research projects
and future directions in scalable concurrent computing.

Table 1. Classification of computer systems into generations.

Generation   Hardware advances                             Software advances
First        Electromechanical devices                     Machine language
Second       Transistors                                   Assembly, Fortran, Algol
Third        Small and medium scale integrated circuits    Multiprogramming
Fourth       Large scale integrated circuits               Advanced compilers
Fifth        Very large scale integrated circuits          Parallel programming


2.  Concurrent  computer  architectures 

In  the  von  Neumann  model,  the  hardware  consists  of  a  processor,  a  memory  module 
(also called storage) and a link to communicate between them as shown in figure 1.
A  program  is  a  sequence  of  instructions  stored  in  memory  and  executed  by  the 
processor.  In  order  to  execute  an  instruction  at  least  three  communications  have  to 
pass  through  the  link  -  the  instruction  (control),  operands  (data  from  memory  to 
processor)  and  results  (stored  back  in  memory).  However,  having  a  single 
communication  bus  poses  bandwidth  limitations  on  the  amount  of  information  that 
can  be  transmitted  between  the  memory  and  processor;  this  imposes  a  performance 
bottleneck,  traditionally  known  as  the  von  Neumann  bottleneck. 
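
The traffic over the single link can be made concrete with a toy simulation. The sketch below is an illustration only: the `Bus` class, the unary instruction format `(op, src, dst)` and the two operations `inc` and `neg` are hypothetical choices, not part of any real instruction set. Every instruction costs three transfers over the one link: the instruction itself, its operand, and its result.

```python
class Bus:
    """Single link between processor and memory; counts every transfer."""

    def __init__(self, memory):
        self.memory = memory
        self.transfers = 0

    def read(self, address):
        self.transfers += 1
        return self.memory[address]

    def write(self, address, value):
        self.transfers += 1
        self.memory[address] = value


def run(bus, program_addresses):
    """Execute unary instructions of the hypothetical form (op, src, dst)."""
    for pc in program_addresses:
        op, src, dst = bus.read(pc)        # 1. instruction: memory -> processor
        operand = bus.read(src)            # 2. operand: memory -> processor
        result = operand + 1 if op == "inc" else -operand
        bus.write(dst, result)             # 3. result: processor -> memory


# Program and data share one memory, as in the stored-program concept.
memory = {0: ("inc", 10, 11), 1: ("neg", 11, 12), 10: 4}
bus = Bus(memory)
run(bus, [0, 1])
print(bus.transfers, memory[12])
```

Two instructions produce six transfers, so in this model the link, not the arithmetic, bounds the rate at which instructions complete.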

Figure 1. The von Neumann model: a processor and a memory connected by a communication bus.

The hardware components in a von Neumann computer operate by performing a
sequence of transitions on a state (see footnote 1). The sequence of transitions is specified by the
user  through  a  program  as  a  function  of  the  state.  Note  that  the  programmer  specifies 
not  only  the  transitions  which  must  be  performed  but  also  a  strict  order  on  the 
execution  of  these  transitions  by  means  of  program  instructions.  In  the  model 
presented  above,  a  single  processor  is  the  sole  unit  of  computational  power  which 
results  in  the  following  two  limitations: 

Physical  limits  on  how  close  components  can  get.  Technological  advances  have  led 
to the miniaturization of components as well as an increase in processing speed. In fact,
processor  speeds  have  doubled  every  eight  years. 

Economic  considerations  on  the  cost  of  a  single,  highly  sophisticated  processor.  One 
solution  is  to  have  a  number  of  inexpensive  units,  all  working  simultaneously  so  that 
the  overall  throughput  of  the  machine  is  increased.  This  is  called  parallel  processing. 

Industrial  automation  accompanied  by  the  expanding  size  and  nature  of  problems 
which  have  to  be  handled  by  computers  has  placed  a  premium  on  speed,  efficiency 
and  accuracy  of  computation.  Weather  prediction,  nuclear  physics,  modelling  physical 
and geological phenomena, genetic code mapping, space exploration, flight and space
vehicle control, and expert systems for medical diagnosis are representative problems.
Applications  run  on  computers  may  be  numeric  or  non-numeric.  In  general,  the  use 
of  computers  is  expanding  from  just  number  crunching  to  include  symbolic  processing, 
processing  of  non-numeric  data  such  as  pictures,  text  and  sentences,  and  open 
information  systems  which  are  characterized  by  continuous  information  flow  between 
autonomous  units. 

Nondeterminism  is  a  key  feature  of  the  real  world  systems  directly  modelled  in  an  open 
system.  Nondeterminism  has  many  sources.  For  example,  when  searching  a  very  large 
database  for  some  specific  information,  it  may  be  necessary  to  explore  many  possible 
choices.  It  is  not  clear  when  we  will  arrive  at  a  solution  or  which  paths  to  investigate 
in  the  process.  Searching  many  paths  for  a  possible  solution  places  a  premium  on 
performance.  To  satisfy  this  growing  demand  for  performance  we  can  exploit 
parallelism. 

Computational  models  are  inspired  both  by  concern  for  representing  real  world 
problems  and  by  architectural  considerations.  In  building  architectures  which  exploit 
parallelism,  there  is  an  obvious  tradeoff  between  the  number  of  processing  units  and 
their complexity. Variations in the degree of coupling of these two factors have given
rise  to  a  number  of  computational  models  for  enhancing  performance.  At  one  end 
of  the  spectrum,  there  are  machines  with  large  numbers  of  very  simple  processors. 
The  degree  of  concurrency  is  very  high,  but  the  power  and  complexity  of  each 
individual  processing  unit  is  low.  The  Connection  Machine  (Hillis  1985)  is  one  such 
computer  available  in  the  market.  On  the  other  hand,  there  are  multiprocessor 
architectures  which  use  sophisticated  processing  units.  The  Encore  Multimax  is  an 
example  of  a  commercial  multiprocessor  machine. 

2.1  High  performance  uniprocessor  architectures 

Supercomputers exploit parallelism by using multifunction pipelines; i.e., their central
processing units (CPU) consist of multiple functional units. Multiple functional
units are processing units that can perform more than one function (operation)
simultaneously. Each functional unit is equipped with a set of registers to store the
data resulting from computation on that functional unit. The multiple functional
units also share a set of general purpose registers that communicate with main
memory. Results from one functional unit can be used as input to other functional
units via a high speed internal bus. This brings us to the concept of pipelined
computation. In pipelined parallelism, a complex operation $O$ is broken up into a
sequence of simpler and faster operations $o_1, o_2, \ldots, o_n$. Different operations can
execute on different data units simultaneously; for instance, the output from $o_1$ can
be automatically piped into the functional unit executing $o_2$, while new data is input
to $o_1$. A number of existing computers exploit pipelined parallelism in their processing
units, for example, CRAY, CDC-7600 and IBM 360/91.

1 A state is the data stored in the memory of the computer at a given point in time.
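
The overlap in a pipeline can be sketched with a small simulation. It assumes, purely for illustration, that every stage takes one time step and that values are never `None`; the function name `run_pipeline` and the three arithmetic stages are hypothetical. Under these assumptions, a pipeline of n stages processes m data items in n + m - 1 steps, whereas applying the n stages to each item in turn would take n × m steps.

```python
def run_pipeline(stages, items):
    """Simulate a linear pipeline; return (results, number of time steps)."""
    n = len(stages)
    latch = [None] * n          # latch[k] holds the value waiting at stage k
    pending = list(items)
    results, steps = [], 0
    while True:
        if pending:
            latch[0] = pending.pop(0)        # new data enters the first stage
        if all(value is None for value in latch):
            break                            # pipeline has drained
        # Fire stages from last to first so that each value advances
        # exactly one stage per time step.
        for k in range(n - 1, -1, -1):
            if latch[k] is not None:
                out = stages[k](latch[k])
                latch[k] = None
                if k == n - 1:
                    results.append(out)
                else:
                    latch[k + 1] = out
        steps += 1
    return results, steps


stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
results, steps = run_pipeline(stages, [1, 2, 3, 4])
print(results, steps)
```

With three stages and four items the simulation finishes in six steps, against twelve for purely sequential execution of the same stages.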

Studies  on  supercomputer  development  over  the  past  years  have  indicated  that 
there  is  only  a  marginal  increase  in  the  sequential  speed  of  supercomputers.  The 
earliest Cray machines operated at 160 megaflops (millions of floating point operations per
second), while the more recent Cray X-MP operates at a peak performance of 210
megaflops.  Current  supercomputers  have  multiple  processors  (up  to  4)  in  addition 
to  multifunctional  units,  but  do  not  provide  an  order  of  magnitude  change  in 
performance. 

The workstation industry has benefited heavily from the introduction of RISC (Reduced
Instruction Set Computer) processors which simplify the design of processors by
transferring some of the work to software. For example, the original VAX uniprocessor
exhibited a performance of the order of 1 MIPS (million instructions per second); for
typical uniprocessor RISC-based workstations today, this figure has increased to
20-25 MIPS. Typically, RISC processors also exploit pipelined parallelism.


2.2  Parallel  architectures 

Parallel  architectures  are  classified  based  on  the  manner  in  which  they  exploit 
concurrency  into  data  parallel  machines  and  control  parallel  machines.  Data  parallelism 
allows  the  same  operations  to  be  independently  performed  on  each  element  of  a  large 
aggregate  of  data.  By  contrast,  control  parallelism  allows  multiple  threads  of  execution. 
Control  parallelism  is  more  general:  implicit  in  control  parallelism  is  the  fact  that 
each  thread  of  execution  may  involve  distinct  data. 
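
The distinction can be illustrated with a short sketch using Python's standard `concurrent.futures` module. The functions `square`, `total` and `largest` are hypothetical examples. Threads are used here only to show the structure of the two styles; in CPython they do not necessarily run such computations simultaneously.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def total(values):
    return sum(values)


def largest(values):
    return max(values)


data = [1, 2, 3, 4]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Data parallelism: the same operation is applied independently
    # to each element of an aggregate.
    squares = list(pool.map(square, data))

    # Control parallelism: distinct threads of execution run different
    # code, possibly on distinct data.
    future_total = pool.submit(total, data)
    future_largest = pool.submit(largest, [7, 5, 9])
    control_results = (future_total.result(), future_largest.result())

print(squares, control_results)
```

The data parallel part is a special case of the control parallel part in which every thread happens to run the same function, which is one way to read the remark that control parallelism is more general.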

2.2a Data parallel models: Machines based on the data parallel model are
frequently known as SIMD (single instruction multiple data) machines (see figure 2).
Each instruction stream operates on multiple units of data such as a vector or array
instead of on a single operand. A single control unit coordinates the operation of
the multiple processors. Synchronous computers using global clocks are quite special
purpose and rather restrictive in their model of concurrent computation. This
approach is exemplified by the Connection Machine (Hillis 1985) and conventional
array processors.

Figure 2. The data parallel model.

Figure 3. The shared memory model (M: memory; P: processing element; C: local cache).

2.2b Control parallel models: Control parallel computers can be divided into two
broad classes: shared memory machines and message-passing concurrent computers
(also called multicomputers) (figures 3 and 4). Shared memory computers have multiple
processors and share a global memory. For efficiency reasons, each processor also
has a local cache.2 Separate caches create the problem of maintaining consistency
between caches when processors may modify shared data in their local caches
simultaneously. The shared memory computers which have been built typically consist
of 16 to 32 processors. Large numbers of processors create increased contention for
access to the global memory. The contention for shared information increases as the
computing resources increase. As a result, shared memory architectures are not
scalable (Dally 1986).

Figure 4. The message passing model (M: memory; P: processing element; C: local cache).

2 A cache is a fast, expensive memory unit between the processor and the main memory and
is used to hold frequently accessed data.

Multicomputers  have  evolved  out  of  work  done  by  Charles  Seitz  and  his  group  at 
Caltech  (Athas  &  Seitz  1988).  Configurations  of  multicomputers  with  only  64 
computers  exhibit  performance  comparable  to  conventional  supercomputers. 
Multicomputers  use  a  large  number  of  small  programmable  computers  (processors 
with  their  own  memory)  which  are  connected  by  a  message-passing  network. 

Depending  on  the  amount  of  memory  per  component,  multicomputers  may  be 
divided  into  two  classes,  namely,  medium-grained  multicomputers  and  fine-grained 
multicomputers.  Two  generations  of  medium-grained  multicomputers  have  been  built. 
A  typical  first  generation  machine  (also  called  the  cube  or  the  hypercube  because  of 
its  communication  network  topology)  consisted  of  64  nodes  and  delivered  64  MIPS. 
Its  communication  latency  was  in  the  order  of  milliseconds.3  On  the  other  hand,  a 
typical  second  generation  medium-grained  multicomputer  has  256  nodes  and  is 
projected to carry out 2-5 K MIPS and has a message latency in the order of tens
of  microseconds.  Third  generation  machines  are  currently  being  built  and  are  expected 
to  increase  the  overall  computational  power  by  two  orders  of  magnitude  and  reduce 
message  latency  to  a  fraction  of  a  microsecond  (Athas  &  Seitz  1988). 

The  frontiers  of  multicomputer  research  are  occupied  by  work  on  fine-grained 
multicomputers.  Two  projects  building  experimental  fine-grained  multicomputers  are 
the J-Machine project by William Dally’s group at MIT (Dally & Wills 1989,
pp.  19-33)  and  the  Mosaic  project  by  Charles  Seitz’s  group  at  Caltech  (Athas  &  Seitz 
1988).  A  number  of  other  innovative  architectures  such  as  dataflow,  reduction 
machines  and  logic  programming  engines  have  been  proposed  (Arvind  &  Culler  1986; 
Shapiro  1987). 

Concurrent  computing  involves  partitioning  and  communication  of  tasks. 
Partitioning  involves  the  splitting  up  of  a  process  into  many  threads  of  computation 
which  may  execute  in  cooperation.  Threads  belonging  to  the  same  process  interact 
with  each  other  and  the  underlying  system  must  provide  for  communication  between 
them.  It  is  also  necessary  that  the  threads  executing  in  parallel  synchronize  or 
cooperate  with  each  other  without  conflicts.  Thus  at  a  higher  level  coordination  is 
required. 

Partitioning may be explicit where the user divides the tasks into a few large threads
or heavyweight objects which may execute in parallel. This is referred to as coarse
grained partitioning. In this approach, however, concurrency inherent in each of the
tasks is not fully exploited. The need for a high degree of parallelism leads to fine
grained partitioning. Typically, partitioning a task into fine grained threads is
performed by the compiler. Here, the task is split into many small threads or
lightweight objects (fine grained parallelism). Very large scale integration (VLSI) is an
elegant medium for expressing this form of parallel computation. The computational
quanta obtained by partitioning a task may need to communicate with each other
and this communication introduces additional overhead. The presence of smaller
threads introduces more communication which may offset the benefits of parallelism.

3 Communication latency is a measure of the time delay involved in transmitting information
from source to destination. As communication latency increases, more time is required to
transmit information from one processing unit to another. As a result, the performance of the
system drops.

Communication  is  needed  to  transfer  information  from  one  processor  to  another. 
Communication  strategies  may  be  synchronous  or  asynchronous. 

•  Synchronous  communication:  Both  the  sender  and  receiver  must  be  available  before 
communication  begins.  A  real  life  example  of  synchronous  communication  is  a 
telephone  that  waits  for  users  at  both  ends  to  be  available  at  the  same  time  before 
conversation  can  occur. 

•  Asynchronous communication: Here, the sender and receiver can operate
independently; the sender can send information without waiting for the receiver
to  be  ready  to  accept  the  information.  Communication  that  takes  place  through 
a  postal  system  is  a  real-world  example  of  asynchronous  communication.  Any 
communication  between  two  people  or  two  points  goes  through  some  post  office. 
Note  that  it  is  not  necessary  to  have  a  centralized  controller  in  such  a  system. 
The  presence  of  buffers  at  the  receiver  allows  us  to  store  the  messages  arriving 
from  possibly  different  senders.  This  kind  of  communication  is  called  buffered 
asynchronous  communication. 
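
The two strategies above can be contrasted in a small single-threaded model. This is only a sketch: the class names `Rendezvous` and `BufferedMailbox` are hypothetical, and a sender that would block in a real synchronous system is represented here by `send` returning `False`.

```python
from collections import deque


class Rendezvous:
    """Synchronous: a message is transferred only if the receiver is ready."""

    def __init__(self):
        self.receiver_ready = False
        self.delivered = []

    def ready(self):
        self.receiver_ready = True

    def send(self, message):
        if not self.receiver_ready:
            return False            # a real sender would wait here
        self.delivered.append(message)
        self.receiver_ready = False
        return True


class BufferedMailbox:
    """Buffered asynchronous: messages wait in a buffer at the receiver."""

    def __init__(self):
        self.buffer = deque()

    def send(self, message):
        self.buffer.append(message)     # the sender never waits

    def receive(self):
        return self.buffer.popleft() if self.buffer else None


phone = Rendezvous()
first_attempt = phone.send("hello")     # receiver not available yet
phone.ready()
second_attempt = phone.send("hello")    # both ends available

mailbox = BufferedMailbox()
mailbox.send("letter 1")                # sent before anyone is receiving
mailbox.send("letter 2")
received = [mailbox.receive(), mailbox.receive()]
print(first_attempt, second_attempt, received)
```

The first telephone-style attempt fails because the receiver is not available, while both postal-style sends succeed immediately and the messages are later collected from the buffer in arrival order.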

When  there  are  two  or  more  processes  executing  in  parallel,  data  dependencies 
dictate  the  threads  which  need  to  cooperate  or  synchronize.  This  cooperation  or 
information  transfer  is  achieved  by  means  of  efficient  synchronization  primitives 
implemented  in  hardware. 

With  increased  parallelism,  we  believe  that  the  current  wave  of  technologies  and 
mechanisms  will  cause  a  shift  in  programming  paradigms.  In  particular,  from  a 
programming  language  standpoint,  we  are  observing  a  shift  from  textual  to  visual 
(graphic)  programs  (Miriyala  1991).  From  an  implementation  standpoint,  there  is  a 
shift  from  static  to  dynamic  resource  management  (Venkatasubramanian  1991). 


3.  Universal  models  for  scalable  concurrent  systems 

3.1  Inadequacy  of  existing  software  strategies 

Computers available in the market today are not sufficiently powerful to keep up
with the pressing demand for computational speed and power. Hardware advances
have  been  so  rapid  that  today  it  is  possible  to  think  about  packing  massive 
computational  power  into  a  few  inches  of  silicon.  The  advent  of  VLSI  technologies  has 
also  brought  forth  highly  concurrent  hardware.  This  alone  is  not  sufficient  to  satisfy 
our  requirements.  The  computational  capacities  of  these  hardware  modules  can  be 
exploited  only  through  efficient  software  methods  to  program  these  machines.  The 
challenge,  therefore,  lies  in  the  fact  that  software  technologies  have  not  scaled  up 
in  proportion  to  potential  hardware  advances.  Radical  software  techniques  have  to 
be  developed  to  concurrently  program  thousands,  and  in  the  future,  millions  of 
processors  to  work  in  concert. 

Most  familiar  programming  languages  are  based  on  the  von  Neumann  model  of 
computation.  A  major  drawback  of  von  Neumann  languages  (languages  based  on 
the  von  Neumann  model  of  computation)  is  that  the  architectural  model  of  the 
machine  is  the  basis  of  the  programming  model  of  the  language.  Thus  the  programmer 
is seriously constrained, being forced to make his/her programs reflect the underlying
architecture when he/she should be concerned with the most effective way to specify
the problem without imposing any artificial constraints. Only then can he/she fully
harness the computational power of the underlying hardware.

Another  handicap  of  traditional  languages  is  their  difficulty  in  combining  existing 
program  modules  in  a  number  of  ways  to  achieve  different  functionalities.4 
Furthermore,  programs  written  in  these  languages  are  designed  for  deterministic 
computations  and  traditional  programming  strategies  break  down  when  we  have  to 
deal  with  nondeterminism. 

Scalability,  in  the  context  of  parallel  computation,  implies  that  given  a  program 
with  sufficient  parallelism,  it  will  be  possible  to  increase  the  performance  of  the  system 
by  adding  physical  resources.  In  a  scalable  system,  no  alteration  of  the  application 
program  is  necessary  to  exploit  the  benefit  of  the  added  resources.  Scalability  is  a 
fundamental requirement of open systems, i.e. of information systems which permit
continuous  influx  of  input  from  new  and  different  sources. 

Realistic  modelling  of  natural  phenomena  and  real  world  systems  as  open  systems 
is  possible  because  an  open  system  has  the  following  attributes: 

(1)  decentralization; 

(2)  inherent  concurrency; 

(3)  organizational  cooperation. 

One  can  view  a  parallel  computation  model  as  a  set  of  abstractions  that  capture 
the  semantics  and  functionality  of  concurrent  program  execution.  This  problem  has 
been  approached  from  different  perspectives  and  various  models  of  concurrency  have 
been proposed by language theorists, complexity analysts and architects of parallel
machines.  We  analyse  some  of  these  models  with  a  view  towards  determining  their 
suitability  as  a  general  purpose  parallel  programming  model  for  scalable  systems. 

There  are  a  number  of  closely  related  approaches  to  developing  universal  models 
of concurrency. A conservative language strategy for defining a model of concurrency
is  to  start  with  sequential  processes  and  add  communication  primitives.  One 
language-based  model  that  does  this  is  Communicating  Sequential  Processes  (CSP) 
developed  by  Tony  Hoare  and  others  (Hoare  1978).  In  CSP,  communication  is 
synchronous,  i.e.,  a  source  process  cannot  send  a  communication  until  a  destination 
process  is  ready  to  accept  it,  and  the  process  topology  is  statically  determined,  i.e.,  the 
process  configuration  cannot  be  altered  during  program  execution.  The  significant 
limitations  of  such  a  model  include  the  need  to  specify  all  concurrency  explicitly,  the 
need  to  predetermine  resources  to  be  used,  and  the  need  to  fix  synchronization  points. 

We  focus  on  the  three  universal  models  -  a  programming  model,  a  complexity 
model,  and  an  architectural  model.  A  universal  model  facilitates  the  development  of 
various  applications  without  any  concern  for  the  underlying  architectural 
configuration.  If  a  universal  model  can  be  used,  it  insulates  hardware  and  software 
developments  from  each  other:  it  is  possible  to  use  the  advances  in  hardware 
(software)  without  having  to  be  overly  concerned  about  equivalent  advances  in 
software  (hardware).  Interestingly,  the  proposed  universal  models  have  similar 
underlying  features.  We  will  discuss  the  following  classes  of  universal  models: 


4 This property is called composability.

A programming model: Actors is a model of concurrent object oriented programming
with  active  agents  developed  primarily  by  programming  language  theorists. 

A  complexity  model:  The  bulk  synchronous  parallel  (BSP)  model  proposed  by  Valiant 
(1990,  pp.  103-111)  is  an  intermediate  model  to  bridge  the  gap  between  concurrent 
programming  languages  and  parallel  architectures.  It  deals  with  complexity  issues 
involved  in  efficient  universality  and  is  a  useful  model  for  designers  of  parallel 
algorithms. 

An  architectural  model:  The  parallel  machine  interface  (PMI)  developed  by  Bill 
Dally  and  his  group  at  MIT  (Dally  &  Wills  1989,  pp.  19-33)  is  an  implementation 
oriented  machine  model.  The  PMI  aims  at  achieving  efficient  hardware 
implementations  of  primitive  abstractions. 

It  is  possible  to  measure  the  cost  of  implementing  different  operations  in  various 
programming  models  on  the  basis  of  the  primitive  operations  on  each  of  these 
universal  models.  We  discuss  them  in  greater  detail  below. 

3.2  The  actor  model 

The actor model, first proposed by Hewitt (1977) and later developed by Agha (1986),
captures the essence of concurrent computation in distributed systems at an abstract
level. In the actor paradigm the universe contains computational agents called actors,
which are distributed in time and space. Each actor has a conceptual location (its
mail address) and a behavior as illustrated in figure 5.

The only way one actor can influence the actions of another actor is to send the
latter  a  communication.  An  actor  can  send  another  actor  (or  itself)  a  communication 
only  if  it  knows  the  mail  address  of  the  recipient.  On  receiving  a  communication,  an 
actor  processes  the  message  and  as  a  result  may  cause  one  or  more  of  the  following 
events: 

•  creation of a new actor,

•  a change in its behavior, and

•  the sending of a message to an existing actor.

In  asynchronous  communication,  the  sender  and  receiver  need  not  coordinate  message 
delivery.  If  two  messages  sent  to  an  actor  arrive  at  their  destination  simultaneously, 
there  must  be  some  mechanism  to  serialize  the  incoming  communications  and  execute 
both  messages.  To  ensure  this,  every  actor  is  equipped  with  a  mailbox  that  queues 
any  communications  received.  Therefore,  communication  in  the  actor  model  is 
asynchronous  and  buffered.  The  size  of  the  buffers  is  theoretically  unbounded. 
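
The mailbox discipline described above can be sketched in a few lines of Python. This is only an illustration under assumptions made here, not part of the actor formalism: the names `Actor`, `counter`, `collector` and `run` are hypothetical, a mail address is represented by the actor object itself, and a naive loop stands in for a real scheduler. Each behavior returns the replacement behavior that will process the next communication.

```python
from collections import deque


class Actor:
    """Toy actor: a mail address (the object itself), a mailbox and a behavior."""

    def __init__(self, behavior):
        self.mailbox = deque()    # unbounded buffer of pending communications
        self.behavior = behavior  # function(actor, message) -> replacement behavior

    def send(self, message):
        # Asynchronous send: the sender enqueues and continues immediately.
        self.mailbox.append(message)

    def step(self):
        # Process exactly one queued communication, if there is one.
        if not self.mailbox:
            return False
        message = self.mailbox.popleft()
        self.behavior = self.behavior(self, message)
        return True


def counter(total):
    """Behavior of a counter actor whose local state is `total`."""
    def behavior(actor, message):
        kind, payload = message
        if kind == "add":
            return counter(total + payload)  # replacement behavior
        if kind == "report":
            payload.send(("value", total))   # payload is a mail address
        return counter(total)
    return behavior


def collector(log):
    """Behavior that records every communication it receives."""
    def behavior(actor, message):
        log.append(message)
        return behavior
    return behavior


def run(actors):
    # Naive scheduler: keep stepping until every mailbox is empty.
    while any([a.step() for a in actors]):
        pass


log = []
sink = Actor(collector(log))
acc = Actor(counter(0))
acc.send(("add", 2))
acc.send(("add", 3))
acc.send(("report", sink))
run([acc, sink])
print(log)  # [('value', 5)]
```

Note that the `report` message carries a mail address, so the set of actors that `acc` can reach changes at run time.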

Although  actors  is  inherently  an  asynchronous  model,  it  is  possible  to  simulate 
synchronous models as specializations of the actor model. An important characteristic
of  communication  in  the  actor  model  is  the  ability  to  communicate  mail  addresses. 
Thus  the  interconnection  topology  of  the  system  is  capable  of  changing  continuously. 
This  adds  to  the  reconfigurability  and  flexibility  of  the  actor  model;  for  example,  it 
allows  resource  management  decisions  such  as  object  to  processor  mapping  to  be 
directly programmed. Another property of the actor model is the guarantee of delivery,
i.e., messages in the system will eventually reach their destination actor. This implies
that  mail  in  transit  cannot  be  indefinitely  buffered. 

Actor execution is graphically expressed in terms of event-diagrams (see figure 6).
An event-diagram is a pictorial representation of the arrival order of events within
a thread of execution and the causal relationship between different threads of
computation. The thick vertical line in figure 6 represents the actor on a linear
temporal scale with time flowing from the top of the line to the bottom. Events that
occur earlier lie above those that occur later. Event diagrams bring out the concept
of local time and local state in the actor model. Causality connections form the
fundamental synchronization mechanism in the actor model.

Figure 5. An abstract representation of actor transitions: When an actor processes
the nth communication, it determines the replacement behavior which will process
the (n + 1)th communication. The mail address of the actor remains unchanged.
The actor may also send communications to specific target actors and create new
actors.

Figure 6. Each vertical line represents the linear arrival order of communications
sent to an actor. In response to processing the communications, new actors are
created and different actors may be sent communications which arrive at their
target after an arbitrary delay.

The  actor  model  is  well-suited  for  fine  grained  computation  because  of  its  dynamicity 
and its ability to create new actors inexpensively. It may sometimes be necessary to
sequentialize  task  execution  using  one  of  the  following  methods: 

(1)  Introduce  causality  constraints  between  different  threads  of  execution.  This  does 
not  reduce  the  grainsize  of  a  task,  but  causes  sequentiality  that  allows  controlled 
resource  allocation. 

(2)  Execute  a  given  thread  of  control  sequentially  instead  of  in  a  functional  fashion. 
Either  user-defined  or  compiler-derived  annotations  could  be  used  to  specify  the 
sequentiality.  For  example,  we  could  have  both  sequential  and  parallel  versions  of  a 
recursive  method.  The  parallel  version  creates  new  tasks  for  every  recursive  call.  The 
sequential  version  is  called  when  the  argument  to  the  recursive  call  falls  below  a 
certain  value.  The  sequential  method  definition  executes  sequentially  to  completion 
without  creating  new  tasks. 
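
Method (2) can be sketched as a pair of recursive functions. The following Python fragment is an illustration only: the threshold `CUTOFF`, the function names and the choice of the Fibonacci recursion are assumptions made here, and Python threads stand in for tasks (under CPython they show the task structure rather than a real speedup).

```python
import threading

CUTOFF = 10  # hypothetical threshold below which no new tasks are created


def fib_seq(n):
    """Sequential version: runs to completion without creating tasks."""
    if n < 2:
        return n
    return fib_seq(n - 1) + fib_seq(n - 2)


def fib_par(n):
    """Parallel version: creates a new task for every recursive call."""
    if n < CUTOFF:
        return fib_seq(n)  # argument fell below the threshold
    results = [None, None]

    def worker(slot, arg):
        results[slot] = fib_par(arg)

    threads = [
        threading.Thread(target=worker, args=(slot, n - 1 - slot))
        for slot in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results[0] + results[1]


print(fib_par(16))  # 987
```

Raising `CUTOFF` coarsens the grain size; setting it to 2 creates a task for every call with an argument of 2 or more.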

Baude  &  Vidal-Naquet  (1991,  pp.  184-195)  have  shown  that  the  asynchronous, 
message  passing  actor  model  is  as  powerful  as  the  traditional  PRAM  or  Parallel 
Random  Access  Memory  model  (used  to  analyse  the  complexity  of  parallel  algorithms) 
which  is  a  synchronous,  shared  memory  model. 

In  the  PRAM  model,  program  execution  is  deterministic  and  can  utilize  an 
unbounded  number  of  logical  processing  units.  The  logical  processing  units 
communicate  via  a  global  shared  memory.  Efficient  parallel  programs  are  defined  as 
those  programs  that  demonstrate  an  exponential  speedup  with  a  polynomial  increase 
in  the  number  of  processing  units.  Such  efficiently  parallelizable  programs  are 
classified  under  the  NC  class  of  programs. 

Efficiently  actor  parallelizable  problems  or  NCactor  are  defined  as  those  problems 
for  which  there  exists  an  actor  program  whose  time  complexity  (the  length  of  the 
execution  chain  created  by  the  actor)  is  a  polylogarithmic  function  of  the  input  size 
and  whose  size  complexity  (a  measure  of  the  space  needed  to  process  the  message) 
is  a  polynomial  function  of  the  input  size.  Baude  &  Vidal-Naquet  (1991,  pp.  184-195) 
provide  a  simulation  of  actors  by  PRAM  and  vice-versa.  In  particular,  they  show: 

$\mathrm{NC}_{\mathrm{PRAM}} = \mathrm{NC}_{\mathrm{actor}}$

In  summary,  the  actor  model  is  a  convenient  perspective  for  the  programmer  of  the 
parallel  machine  due  to  its  flexibility  and  simplicity  (programming  with  abstractions). 
However,  it  does  not  model  some  architectural  characteristics  -  such  as  the  overheads 
involved  in  message  passing. 

3.3  The  bulk  synchronous  parallel  model 

Valiant (1990, pp. 103-111) proposed the bulk synchronous parallel (BSP) model as
a  suitable  bridge  between  parallel  languages  and  architectures.  The  BSP  model  consists 
of  three  units: 

•  components - which are computing or memory units;

•  router - responsible for point to point message passing;

•  periodic synchronization - that performs synchronization of all or a subset of
the processors with a periodicity L.

To  achieve  efficient  universality  results,  we  must  be  able  to  model  the  performance 
of  the  system  as  a  relation.  If  h  is  the  number  of  messages  sent  or  received  in  a 
computation  on  a  given  processor  and  g  is  the  sum  of  all  computations  in  the  system 
per  second  divided  by  the  number  of  data  words  delivered  per  second,  it  is  assumed 
that  all  communications  are  delivered  in  time  gh.  Let  the  startup  latency  or  cost  be 
s.  The  router  sends  and  is  sent  at  most  h  messages  in  a  superstep.  Such  supersteps 
are also called h-relations. Therefore, the cost to realize an h-relation is gh + s. This
technique  of  modelling  the  performance  of  a  system  as  a  relation  offers  parameterized 
controllability  and  is  one  of  the  basic  features  of  the  BSP  model  that  makes  it 
amenable  to  complexity  analysis. 
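
As a small worked example of the cost relation above, the following Python sketch computes gh + s for one superstep. The function name and the sample numbers are hypothetical, and taking h as the maximum number of messages sent or received by any single component is an assumption of this sketch.

```python
def h_relation_cost(sent, received, g, s):
    """Cost g*h + s of routing one h-relation.

    sent[i] and received[i] are the numbers of messages sent and received
    by component i in the superstep. Taking h as the maximum over all
    components is an assumption of this sketch.
    """
    h = max(max(sent), max(received))
    return g * h + s


cost = h_relation_cost(sent=[3, 1, 4], received=[2, 5, 1], g=2, s=10)
print(cost)  # 2 * 5 + 10 = 20
```

The startup cost s is paid once per superstep, so it is amortized better when h is large.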

Memory  components  in  the  BSP  model  are  distributed  across  the  processing 
components  and  therefore  access  to  every  computational  unit  is  equally  likely.  To 
allow  efficient  symbolic  to  physical  address  mapping,  the  BSP  model  adopts  a 
pseudo-random  mapping  or  hashing  mechanism.  The  assumption  is  that  known  hash 
functions  are  used  and  they  can  be  computed  locally  and  inexpensively. 
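
A minimal sketch of such a mapping, assuming SHA-256 as a stand-in hash (the text does not name a particular hash function) and a hypothetical helper `component_for`:

```python
import hashlib
from collections import Counter


def component_for(address, num_components):
    """Map a symbolic address to the index of a memory component."""
    digest = hashlib.sha256(str(address).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_components


# Consecutive symbolic addresses are scattered over 8 components.
placement = Counter(component_for(a, 8) for a in range(1000))
print(sorted(placement.items()))
```

The mapping is computed locally from the address alone, so no component has to consult a shared directory.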

Synchronization in the BSP model is carried out periodically using bulk synchronization.
The process of bulk synchronization, from which the model derives its name,
is  as  follows.  Initially,  the  mean  delay  to  execute  each  operation  is  roughly  estimated. 
After  the  estimated  time  has  elapsed,  an  elected  processor  sends  out  a  sync  detect 
signal  to  all  the  other  processes  along  a  spanning  tree.  If  synchronization  is  achieved, 
we  can  proceed  with  the  next  phase  of  computation.  Otherwise,  a  new  timeslot  is 
allocated  and  the  process  is  repeated  till  synchronization  is  achieved. 
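
The effect of bulk synchronization on program structure can be sketched with a barrier. In the Python fragment below, `threading.Barrier` replaces the estimated-timeslot and spanning-tree protocol described above, so it shows only the superstep structure, not that protocol. The function `bsp_run` and the neighbor-sum computation are hypothetical.

```python
import threading


def bsp_run(values, supersteps):
    """In each superstep, component i adds its left neighbor's value."""
    n = len(values)
    current = list(values)
    barrier = threading.Barrier(n)

    def component(i):
        for _ in range(supersteps):
            incoming = current[(i - 1) % n]  # communication phase: read neighbor
            barrier.wait()                   # nobody writes until all have read
            current[i] += incoming           # local computation phase
            barrier.wait()                   # bulk synchronization ends the superstep

    threads = [threading.Thread(target=component, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return current


print(bsp_run([1, 2, 3, 4], 2))  # [12, 8, 8, 12]
```

No component starts the next superstep until every component has reached the barrier.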

The  tasks  generated  are  executed  in  a  sequential  fashion  but  all  the  tasks  generated 
can  execute  in  parallel.  The  tasks  involved  in  a  bulk  synchronization  must  wait  until 
the  synchronization  has  been  achieved  for  further  continuation  of  the  executing  task. 
However,  it  is  possible  for  processing  components  to  switch  this  mechanism  off  and 
execute  without  waiting  for  synchronization.  As  a  consequence,  task  granularity  in 
the  BSP  model  is  controllable  by  varying  the  periodicity  of  synchronization,  L.  As  L 
increases,  the  granularity  of  the  program  also  increases.  The  type  of  algorithms  most 
naturally  expressed  using  the  BSP  model  are  PRAM  programs. 

The  BSP  is  a  possible  model  for  the  designer  of  a  parallel  algorithm.  Valiant 
(1990, pp. 103-111) demonstrates how the BSP model can be embedded on theoretical
models  like  PRAM  as  well  as  architectural  models  like  networks  of  systems.  However, 
the  level  of  abstraction  in  BSP  makes  it  difficult  to  specify  optimizations  that  may  be 
necessary  for  efficiency  -  in  particular,  no  specific  language  interface  or  architectural 
issues  can  be  addressed.  Furthermore,  the  model  limits  communication  costs  in  an 
architecture  by  assuming  the  feasibility  of  efficient  bulk  synchronization.  Thus  BSP 
would  need  to  be  modified  before  it  could  be  used  as  a  model  of  a  realistic  scalable 
system. 

3.4  The  parallel  machine  interface 

In  Dally  &  Wills  (1989,  pp.  19-33),  a  universal  machine  model  has  been  proposed 
to  support  various  parallel  models  of  computation,  the  Parallel  Machine  Interface 
(PMI).  Any  machine  model,  sequential  or  parallel,  must  have  abstractions  representing 
hardware  components,  i.e.,  memory,  instructions  and  instruction  sequencing.  One 
such general-purpose abstraction in a sequential machine that has been very successful
is the notion of stack-based storage allocation. Complex modelling primitives lack
flexibility  and  generality,  and  are  less  amenable  to  optimizations  and  therefore,  it  is 
important  to  keep  the  modelling  primitives  simple  and  straightforward. 

What  are  the  kinds  of  abstractions  needed  to  support  a  parallel  model  of 
computation?  Three  basic  requirements  of  any  parallel  model  are  communication, 
synchronization  and  naming.  Most  proposed  models  of  parallel  computation  like 
dataflow  (Arvind  &  Culler  1986),  static  message  passing,  shared  memory,  parallel 
logic  programming  (Goto  et  al  1988,  pp.  208-229)  and  concurrent  object  oriented 
programming  improvise  on  similar  implementations  of  these  primitive  abstractions. 
The  goal  is  therefore  to  design  efficient  hardware  implementations  of  these  primitive 
mechanisms  that  are  portable  across  different  programming  systems. 

A  parallel  machine  interface  isolates  the  issues  of  programming  models  from  the 
details  of  machine  organization  and  implementation.  This  abstract  model  of  parallel 
systems  is  then  embellished  with  special  features  in  order  to  support  a  particular 
programming  paradigm  efficiently  on  a  specific  target  architecture. 

A  parallel  architecture  interface,  called  Pi  has  been  proposed  based  on  the  PMI 
(Wills  1990)  and  the  implementation  of  several  machine  models  has  been  illustrated. 
An  abstract  machine  architecture  has  also  been  proposed  for  the  same  and  we  will 
use  this  abstract  architecture  implied  by  Pi  in  the  discussion  below. 

The  PMI  views  storage  as  a  collection  of  data  units  or  segments  which  are  logically 
related.  It  makes  no  assumptions  about  how  segment  names  are  interpreted  in  an 
implementation.  However,  some  translation  mechanism  must  be  built  to  support 
segment  addressing  and  access.  The  cost  of  accessing  a  segment  is  directly  dependent 
on  the  translation  scheme  chosen. 

The  PMI  currently  assumes  a  message  passing  model  of  communication,  but 
mechanisms  like  communication  via  shared  memory  or  RPC  (remote  procedure  calls) 
can  also  be  implemented.  The  communication  network  does  not  take  a  stand  on 
routing  and  buffering  mechanisms.  As  in  the  actor  model,  the  PMI  assumes  that  all 
messages  are  eventually  delivered  (albeit  with  an  arbitrary  delay  due  to  network 
latency)  and  that  there  is  no  message  order  preservation. 

The  PMI  models  the  three  primitive  operations  in  a  parallel  system  as  follows: 

(1)  Communication  is  represented  by  means  of  a  message  send.  Other  forms  of 
communication  like  shared  memory  reads  and  writes  are  more  complex  mechanisms 
implemented  in  terms  of  message  sends. 

(2) Synchronization is of two kinds: data synchronization, which is needed when there
is a data dependence between executing tasks, and control synchronization, which
requires that a task must complete before another can begin execution. The mechanism
used to implement safety and correctness in synchronization is actor firing, and
progress after synchronization is guaranteed by I-structure accesses (Arvind et al 1987).

(3)  Naming  issues  in  most  models  can  be  implemented  as  some  form  of  translation 
from  a  logical  name  to  a  physical  address  in  the  memory  of  the  system.  The  model 
assumes  that  the  translation  is  embedded  within  the  message  injection  and  reception 
mechanisms. 

In  addition,  the  Pi  model  defines  primitives  to  provide  information  regarding  the 
relative  proximity  of  objects  to  a  particular  node. 

The  abstract  machine  implied  by  the  Pi  model  consists  of  a  set  of  nodes 
interconnected  by  means  of  a  communication  network.  Although  the  Pi  model  is 
fine-grained  (where  grain  size  is  measured  in  terms  of  the  node  size),  it  should  be 
observed that fine-grained systems are upward-compatible with larger grained systems.

Table 2. Universal models of parallel computation.

| | BSP | Actor | PMI |
|---|---|---|---|
| Storage model | Memory and components | Actors as Abstract Data Type (ADT) | Logical collection of data units called segments |
| Communication | Asynchronous point to point | Asynchronous point to point | Asynchronous point to point |
| Synchronization | Bulk synchronization | Causality and history-sensitivity | Access attributes |
| Naming | Pseudo-random mapping of symbolic names to physical addresses | Unique actor names | Nametable for address translation |
| Process/object topology | Unclear | Dynamic creation; dynamic configuration | Dynamic creation; dynamic configuration |
| Expressing parallelism | Sequential tasks specified | Parallelism default; implicit synchronization | Fine-grained threads; explicit synchronization |
| Locality | Hash function distributes memory components | Actors are a unit of locality | Primitives to support spatial locality |

In  other  words,  architectures  for  the  fine-grained  computation  can  support  larger 
grain  size  efficiently  (but  not  necessarily  vice-versa).  Using  the  Pi  model,  it  is  possible 
to  measure  the  cost  of  implementing  different  operations  in  various  programming 
models  in  terms  of  the  primitive  operations  on  the  PMI.  The  translation  between  PMI 
and  actors  is  one-to-one.  It  is  worth  noting  that  costs  measured  in  terms  of  the  PMI 
model  apparently  closely  correspond  to  actual  machine  costs. 

3.5  A  comparison  of  universal  models 

Table  2  gives  a  summary  of  the  three  different  universal  models  we  have  considered. 
As is evident from all the models, some attributes of parallel computation are
mandatory  to  any  model  irrespective  of  whether  it  has  an  analytical  or  architectural 
flavour,  i.e.,  memory,  communication  and  synchronization.  However,  the  degree  of 
specification  with  respect  to  other  attributes  depends  largely  on  the  level  of  abstraction 
that  the  model  is  trying  to  achieve. 

3.6  A  hierarchical  view  of  parallel  computation 

Good  scalability  characteristics  are  achieved  only  if  the  problem  size  scales  up  with 
the  number  of  processors.  In  other  words,  merely  increasing  the  number  of  processors 
keeping  the  problem  size  constant  is  expensive.  Furthermore,  scalability  and  complexity 
analysis  on  these  models  must  also  consider  the  implications  of  the  algorithm  and 
its  mapping  on  the  underlying  architecture  (see,  for  example,  Singh  et  al  1991).  A 
hierarchical  view  of  a  parallel  machine  is  shown  in  figure  7. 

The purpose of the hierarchical model is to provide layers of abstraction in a
parallel system. When dealing with high level issues like algorithm design or
programming language issues, we would like to abstract away from implementation
or architectural details like the communication subsystem or resource management.
However, in order to discuss resource management issues without referring to any
particular programming language or architecture, we must isolate features specific to
parallelism from issues specific to the framework in which parallelism is being
exploited. The hierarchical view in figure 7 illustrates the different levels at which a
parallel system can be viewed for different purposes (Venkatasubramanian 1991).

Figure 7. A hierarchical view of parallel systems: Existing models do not consider
the implications of algorithm design in the scalability and complexity analysis.
[Figure labels: programming models; memory; architectures.]

We  now  consider  parallel  resource  management  with  the  actor  model  as  the 
programmer’s  view  of  the  world  and  the  parallel  machine  interface  as  the  underlying 
model  of  a  highly  concurrent  machine. 


4.  Resource  management  for  scalable  systems 

In  this  section,  we  focus  on  the  actual  implementation  of  scalable  parallel  programs 
on  a  highly  parallel  machine.  In  particular  we  elaborate  on  techniques  for  the  effective 
utilization  of  hardware. 

Resources  refer  to  the  hardware  and  software  components  of  a  computer  system 
which  are  required  in  order  to  solve  a  specific  problem.  Resource  management  involves 
techniques  and  mechanisms  used  to  efficiently  allocate,  utilize  and  coordinate  these 
resources. Parallelism creates new complexities for resource management; for example,
the communication network is one of the most critical execution resources. In fact,
a bottleneck in scaling concurrent computers is not limitations in the computational
power  of  individual  processors,  but  the  costs  and  delays  associated  with  transferring 
information  from  one  processor  to  another.  Thus  resource  management  strategies 
must  try  to  reduce  the  communication  traffic. 

Another  source  of  complexity  in  parallel  systems  is  that  different  resources  may 
be  needed  simultaneously  and  these  resources  must  all  be  available.  Furthermore,  in 
a large scale concurrent system, dynamic allocation of resources is necessary because
computations may need to be automatically divided into large numbers of sub-computations.
In actor systems, such division happens each time concurrent sub-requests
are made as a result of processing a given request (Agha & Hewitt 1987).
With every fork in the computation, execution resources must be provided to the
sub-computations created. Often the behavior of sub-computations cannot be
determined in advance. Different strategies for subdividing resources lead to different
results. An improper allocation may lead to a condition where a useful sub-computation
cannot proceed due to the non-availability of resources, called the dangling
sub-computation problem.

Dynamically  spawning  off  tasks  may  give  rise  to  expanding  resource  requirements 
limited  only  by  their  physical  availability.  By  serializing  executions,  resources  can  be 
exclusively  allocated  to  the  executing  process  and  there  are  no  resource  contentions. 
However,  little  parallelism  is  exploited  by  this  scheduling  strategy.  The  other  extreme 
aims  at  extracting  all  the  parallelism  inherent  in  the  application  by  using  a  suitable 
programming  paradigm.  In  such  cases,  studies  have  determined  that  many  programs 
are “embarrassingly parallel” (Sargeant 1986). Excessive parallelism leads to inordinate
resource  utilization  and  may  have  an  even  more  serious  outcome  -  deadlock  due  to 
unavailability  of  resources  for  any  of  the  parallel  tasks  to  continue  execution.  There 
is  an  obvious  crossover  point  between  the  degree  of  parallelism  exploited  and  effective 
resource  utilization  in  a  concurrent  system.  The  goal,  then,  is  to  be  able  to  set  up  a 
flexible  throttle  which  will  allow  us  to  increase  or  reduce  the  degree  of  parallelism  at 
will  (Arvind  &  Culler  1986;  Sargeant  1986). 
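
One simple way to picture such a throttle is a counting semaphore that bounds the number of tasks in flight. The Python sketch below is an assumption-laden illustration, not the mechanism of the cited work: `run_throttled` and its `limit` parameter are hypothetical names. A limit of 1 corresponds to fully serialized execution, and a very large limit to unrestricted spawning.

```python
import threading


def run_throttled(tasks, limit):
    """Run callables in threads, with at most `limit` executing at once."""
    slots = threading.Semaphore(limit)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}
    results = [None] * len(tasks)

    def worker(index, task):
        try:
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            results[index] = task()
        finally:
            with lock:
                state["active"] -= 1
            slots.release()  # free the slot only after the task has finished

    threads = []
    for index, task in enumerate(tasks):
        slots.acquire()  # blocks the spawner while `limit` tasks are running
        t = threading.Thread(target=worker, args=(index, task))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return results, state["peak"]


tasks = [lambda n=n: n * n for n in range(20)]
results, peak = run_throttled(tasks, limit=3)
print(results[:5], peak)
```

Here `peak` records the largest number of tasks that were active at the same time, which never exceeds `limit`.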

Another  aspect  of  resource  management  is  controlling  the  use  of  resources  at  the 
application  level.  Some  problems  exhibit  an  exponential  growth  in  the  number  of 
possible  paths  which  could  lead  to  a  solution;  it  is  infeasible  to  explore  all  of  them. 
This  in  turn  means  that  differing  amounts  of  resources  must  be  allocated  to  different 
paths.  The  assessment  of  how  fruitful  a  particular  path  is  changes  over  time  as  one 
assesses  the  intermediate  results  along  the  path.  This  is  one  reason  resources  need  to 
be  controlled  and  re-allocated  dynamically. 

Other  complications  in  a  real  implementation  arise  from  the  presence  of  prioritized 
execution  of  processes,  conflict  resolution,  deadlock  handling  and  synchronization. 
Effective  resource  management  in  parallel  machines  is  a  combination  of: 

Static analysis: The text of the program is analysed before execution at compile-time
to detect information such as useless concurrency via dependence analysis, rough
estimates  of  resource  requirements,  data  placement  and  partitioning. 

Heuristics:  It  is  often  sufficient  to  approach  the  resource  management  problem 
conservatively.  In  fact,  mechanisms  implemented  to  guarantee  completeness  or  totality 
may  degrade  machine  performance,  thereby  defeating  the  purpose  of  an  efficient 
resource  management  strategy.  Heuristics  designed  to  achieve  effective  resource 
management  concentrate  on  detecting  sections  of  program  execution  critical  to 
performance  and  target  their  restructuring  efforts  at  those  sections  where  the  payoffs 
would  be  significantly  large. 

Reconfiguration  strategies:  The  system  must  be  capable  of  dynamically  detecting 
“congestion  points”  and  handling  them,  when  appropriate,  by  alteration  of  the  existing 
system  configuration. 

Resource  management  issues  can  be  divided  into  actor/process  management, 
memory  management  and  I/O  (input-output)  management. 



4.1  Actor  management 

Actor  placement  and  migration  is  governed  by  mechanisms  implemented  to  handle 
locality  and  load  balancing. 

4.1a Locality: A process in one processing unit may need to communicate with a
process or object on another processing unit. For example, in figure 8, processes P1
and  P3  on  different  processing  units  may  need  to  communicate  with  each  other. 
Similarly,  processes  P2  and  P4  also  interact  with  each  other.  The  allocation  of 
processes  as  illustrated  in  figure  8  reduces  processor  utilization  caused  by  tasks  waiting 
for  the  result  of  a  communication.  Frequent  communication  between  processes  on 
different  processors  also  increases  network  contentions.  Minimizing  network  traffic 
is  important  even  in  the  presence  of  high  speed  networks,  where  the  network  bandwidth 
is  the  major  limiting  constraint  on  communication.  Locality  directed  scheduling 
policies,  as  in  figure  9,  reduce  interprocessor  communication,  thereby  decreasing 
network  traffic. 

Memory  locality  is  a  property  of  the  pattern  of  a  program's  references  to  memory. 
References  which  are  tightly  grouped  together  in  terms  of  addresses  are  said  to  have 
spatial  locality.  References  close  together  in  time  are  said  to  have  temporal  locality. 
Three  distinct  classes  of  locality  can  be  identified  in  user  programs: 

(1)  temporary  objects  that  are  usually  intermediates  during  computation; 

(2)  fairly  long  lived  data  structures  whose  lifetime  is  a  significant  portion  of  the 
program’s  lifetime; 

(3)  permanent  structures  such  as  the  runtime  system’s  routines  and  structures. 

The  above  hierarchy  of  relative  object  lifetimes  is  well  portrayed  and  exploited  in  the 
generational  garbage  collection  schemes  discussed  later. 

4.1b Scheduling: The scheduling problem in multiprocessor machines is that of
distributing  multiple  threads  of  control  to  execute  on  the  available  processing 
resources.  Threads  which  share  a  lot  of  data  and  threads  which  communicate 
frequently  might  yield  better  throughput  if  scheduled  to  execute  on  the  same  processor. 
But  this  means  that  the  execution  of  these  tasks  is  serialized,  inhibiting  possible 
parallelism. 

Figure 8. A system that does not exhibit locality. Communication which occurs across
processors blocks the network which is a critical resource.

Figure 9. A system that exploits locality. Locality based scheduling strategies avoid
unnecessary communication overhead.

A straightforward policy is to statically schedule tasks to execute on specific
processors despite the fact that better choices could be made dynamically after the
execution scenario is better defined. An alternative is to perform dynamic
scheduling by having a global pool of tasks which need to be executed, from which
processors can pick their next task to execute. However, a centralized job queue may
disrupt the locality inherent in the application.
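
Dynamic scheduling from a global pool can be sketched as follows. The Python fragment is illustrative only: `run_from_pool` is a hypothetical name, and threads stand in for processors. Which processor ends up executing which task is decided at run time.

```python
import queue
import threading


def run_from_pool(tasks, num_processors):
    """Each processor repeatedly picks its next task from one global pool."""
    pool = queue.Queue()
    for index, task in enumerate(tasks):
        pool.put((index, task))
    results = [None] * len(tasks)
    executed_by = [None] * len(tasks)

    def processor(pid):
        while True:
            try:
                index, task = pool.get_nowait()
            except queue.Empty:
                return  # pool drained: this processor is done
            results[index] = task()
            executed_by[index] = pid

    threads = [
        threading.Thread(target=processor, args=(pid,))
        for pid in range(num_processors)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, executed_by


tasks = [lambda n=n: n * n for n in range(10)]
results, executed_by = run_from_pool(tasks, num_processors=3)
print(results)
print(executed_by)
```

The contents of `executed_by` may differ from run to run, which is the sense in which a centralized queue ignores any relationship between a task and the processor holding its data.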

4.1c  Load  balancing:  Load  balancing  is  the  task  of  keeping  the  processors  of  a 
parallel  machine  uniformly  busy.  Figures  10  and  11  illustrate  this  concept. 

For  effective  load  balancing,  each  processor  needs  to  know  the  degree  of  load  in 
every  other  processor,  or  at  least,  some  controller  needs  to  maintain  load  information 
in  order  to  make  load  balancing  decisions.  In  the  latter  case,  there  is  likely  to  be  a 
contention  for  this  centralized  resource  (i.e.  the  controller).  If  there  is  a  static 
interconnection  scheme  between  processors  in  a  system,  each  processor  need  only 
maintain  nearest  neighbor  information. 
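
To illustrate balancing with nearest neighbor information only, the sketch below uses a simple diffusion scheme on a ring of processors. The diffusion rule, the factor of one third and the name `balance_ring` are assumptions chosen here for illustration; the text does not prescribe a particular algorithm. In every round each pair of adjacent processors compares loads, and the heavier one hands a third of the difference to the lighter one.

```python
def balance_ring(loads, rounds):
    """Diffuse integer work units around a ring using only neighbor loads."""
    loads = list(loads)
    n = len(loads)
    for _ in range(rounds):
        delta = [0] * n
        for i in range(n):
            j = (i + 1) % n                        # right-hand neighbor of i
            move = int((loads[i] - loads[j]) / 3)  # truncates toward zero
            delta[i] -= move                       # negative move: j gives to i
            delta[j] += move
        # Apply all transfers at once, as in a synchronous round.
        loads = [load + d for load, d in zip(loads, delta)]
    return loads


print(balance_ring([12, 0, 0, 0], 5))  # [4, 3, 2, 3]
```

The total load is conserved, and the process stops changing once neighboring loads differ by less than three units, so the result is approximately rather than perfectly uniform.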

Locality  and  load-balancing  are  contradictory  constraints  that  need  to  be  weighed 
against  each  other  for  determining  an  optimal  tradeoff  between  dispersion  and 
aggregation  (Athas  1987).  Locality  aims  at  placing  related  objects  in  close  physical 
proximity,  thereby  mitigating  the  frequent  communication  costs.  Load  balancing 
efforts  are  geared  toward  splitting  and  distributing  work  and  do  not  encourage 
“clubbing”  of  computational  nodes  or  threads.  In  order  to  make  effective  use  of 
resources,  we  must  exploit  the  right  amount  of  locality  needed  to  mask  the 
communication  latency.  The  degree  of  object  mobility  required  to  achieve  this  balance 
and  mechanisms  to  deal  with  this  have  been  explored  in  Jul  (1989). 

Figure 10. Processors before load balancing. Note that the throughput of the
system is limited by its slowest component. Processes within a processing unit
execute sequentially and the heavily loaded processor becomes a bottleneck.

Figure 11. Processors after load balancing. The uniform distribution of
computational units improves the system throughput.

4.1d Network activity: An important concern of network behavior in a parallel
system is that the links of the network should be kept uniformly busy with data
objects. Non-uniform distribution of objects may result in bottlenecks at frequently
accessed  nodes,  thereby  delaying  other  communications.  Often  a  process  will  create 
a  stream  of  information  used  by  another  process.  Such  processes  are  called  producer 
and consumer processes respectively. An important factor for proper utilization of the
network is that the message dispatch rate of a producer and the message reception rate
of its consumer be compatible. Consider the scenario where we have a
producer-consumer  relationship  between  objects.  In  order  to  exploit  concurrency, 
we  assign  them  to  different  nodes.  If  these  nodes  are  adjacent  to  each  other  in  a  mesh, 
we  have  a  single  link  between  them  and  this  may  result  in  non-uniform  traffic  in  the 
network.  To  avoid  this  problem,  we  can  place  the  two  objects  on  nodes  which  are  far 
apart.  However,  note  that  message  reception  at  the  consumer  is  serialized  and  placing 
the  nodes  far  apart  in  the  network  would  block  the  network  which  is  a  shared  resource 
in  the  system.  Thus  there  is  a  contradictory  argument  that  presses  for  locality.  If  the 
producing  and  consuming  rates  are  well-matched,  we  could  exploit  concurrency 
without  excessive  traffic  on  the  network. 
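The effect of mismatched rates can be seen in a toy discrete-time model. This is only a sketch: the function name and the idea of counting undelivered messages per tick are assumptions introduced for illustration.

```python
def backlog_over_time(produce_rate, consume_rate, ticks):
    """Return the number of undelivered messages after each tick."""
    backlog = 0
    history = []
    for _ in range(ticks):
        backlog += produce_rate                  # producer dispatches messages
        backlog -= min(backlog, consume_rate)    # consumer drains what it can
        history.append(backlog)
    return history
```

With matched rates the backlog stays at zero. When the producer is faster, undelivered messages accumulate tick after tick, and in a real machine they would be occupying links and buffers of the shared network.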

4.1e  A  distributed  namespace:  We  assume  that  communication  in  a  distributed 
memory  system  can  be  modelled  as  a  series  of  message  sends  and  that  messages  are 
routed  by  means  of  a  fast,  efficient  routing  network  to  their  destinations.  Local 
communication is accomplished through primitive messages⁵. We advocate having
a uniform address space across all nodes in a network. All entities in the system are
referred to by a virtual name, giving us a uniform namespace over the entire system.
A uniform namespace brings us additional flexibility in resource handling: it permits
us to relocate objects in support of actor placement, migration and garbage
collection. The cost of this flexibility is the overhead associated with name
translation (virtual name to physical address). Each access to an object must query
the translation table and it is therefore vital that all translations be completed
with a relatively low latency. Where possible, virtual name to physical address
translation should be avoided for objects which reside locally and, as far as
possible, for remote objects, through intelligent compilation techniques and
optimizations.
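A minimal sketch of a uniform namespace with a translation table is given below. The `Node` class, the shared `directory` dictionary and the `migrate` helper are hypothetical names chosen for the example. A real system would distribute the table rather than share one dictionary.

```python
class Node:
    def __init__(self, node_id, directory):
        self.node_id = node_id
        self.directory = directory   # translation table: virtual name -> node id
        self.local = {}              # virtual name -> object stored on this node
        self.lookups = 0             # number of translation-table queries made

    def create(self, name, obj):
        self.local[name] = obj
        self.directory[name] = self.node_id

    def locate(self, name):
        if name in self.local:       # local object: no translation needed
            return self.node_id
        self.lookups += 1            # remote object: pay for a translation
        return self.directory[name]


def migrate(name, src, dst):
    """Move an object between nodes; its virtual name does not change."""
    dst.local[name] = src.local.pop(name)
    dst.directory[name] = dst.node_id
```

Because callers hold only the virtual name, an object can migrate without any reference to it being rewritten. The `lookups` counter makes visible the translation overhead that local accesses avoid.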

⁵ Primitive messages are messages that a processor sends to itself.

4.2 Memory management

Storage management is used to allocate space for an object when it is created and
subsequently reuse this space when the object is no longer needed. Memory which
is not accessible is referred to as garbage and the reclamation of garbage is garbage
collection. An efficient storage management scheme plays a crucial role in enhancing
the  efficiency  of  a  programming  system.  The  two  major  issues  involved  are  storage 
allocation  and  reclamation.  Automatic  storage  management  schemes  require  that  the 
runtime  system  be  capable  of  recognizing  a  shortage  of  memory  and  reclaiming  unused 
memory  for  reallocation.  The  actor-based  storage  model  adopted  here  assumes  the 
following  operating  systems  support  services: 

(1)  a  mechanism  that  can  allocate  and  deallocate  contiguous  blocks  of  memory; 

(2)  abstractions  needed  to  access  objects  in  memory  with  a  single  virtual  address 
across  a  distributed  global  object  namespace; 

(3)  effective  support  for  relocation  of  objects  either  within  the  heap  or  across  the 
network  (object  migration). 

4.2a  Garbage  collection:  There  have  been  a  number  of  schemes  and  algorithms  for 
performing  garbage  collection  (Ungar  1984,  pp.  157-167;  Baker  &  Hewitt  1987).  The 
following  steps  are  either  implicitly  or  explicitly  performed  in  all  the  algorithms. 

(1)  Identification  of  accessible  objects; 

(2)  reclamation  of  the  inaccessible  memory  objects; 

(3)  compaction  of  memory  to  improve  locality. 

Some  of  the  traditional  garbage  collection  algorithms  take  time  proportional  to  the 
size  of  memory  which  makes  them  inefficient  as  the  size  of  memory  increases.  The 
additional  time  taken  to  detect  and  reclaim  all  the  unreachable  memory  cells  may 
also  destroy  the  interactive  response  of  the  system.  Storage  reclamation  involves  both 
temporal and spatial overhead. In addition, the stringent time restrictions imposed by
interactive programs necessitate efficient memory management mechanisms.

The general requirements of a garbage collection algorithm are:

(1)  good  interactive  response; 

(2)  capability  to  reclaim  circular  structures; 

(3)  requires  minimal  hardware  support; 

(4)  meshes  well  with  virtual  memory; 

(5)  causes  only  a  small  impact  on  execution  overhead. 

With  the  help  of  operating  system  services,  a  number  of  low-level  memory 
management  details  can  be  abstracted  thus  enabling  us  to  design  a  generalized  garbage 
collection  algorithm.  We  will  now  briefly  describe  the  traditional  GC  mechanisms  and 
extend  them  to  parallel  machines. 

In  Reference  Counting  schemes,  every  cell  (object)  in  memory  is  associated  with  a 
field  called  the  reference  count  which  is  a  count  of  the  number  of  references  to  the 
cell.  The  reference  count  of  a  cell  is  continuously  updated  as  pointers  to  the  cell  are 
created  or  destroyed.  When  the  reference  count  of  a  cell  becomes  0,  the  object  is  no 
longer  referenced  and  can  be  collected.  Reference  counting  mechanisms  are  incremental 
in  nature.  Therefore,  computation  is  not  interrupted  for  a  significant  period  of  time. 
A  major  flaw,  however,  is  their  inability  to  handle  circular  garbage  (because  the 
reference  count  of  any  cell  in  a  circular  structure  can  never  be  zero).  Also,  when  an 
object  is  collected,  additional  work  is  done  in  decrementing  the  reference  counts  of 
any  child  cells.  This  process  can  be  potentially  unbounded.  There  have  been  attempts 
to  modify  reference  counting  algorithms  to  accommodate  various  implementation 
issues  (Deutsch  &  Bobrow  1976;  Bevan  1987,  pp.  273-288;  Watson  &  Watson 
1987, pp. 432-443; Goldberg 1989, pp. 313-320; Ichisugi & Yonezawa 1990).

The Mark and Sweep algorithm detects garbage by halting the mutator (application
program) and then performing two (optionally three) phases. In the first phase, a
predetermined root set is used to mark all the reachable objects; the second phase
reclaims dead objects one at a time by walking through the entire memory space; and
a third, optional phase compacts memory to avoid fragmentation. This algorithm
collects all available garbage, including circular structures, and uses only one extra
bit per word for reclamation purposes. However, the running time of mark and sweep
is proportional to the size of memory, and hence this method is impractical for large
memory sizes.
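The contrast between reference counting and mark and sweep can be sketched in Python as follows. This is a toy model, not an implementation from the cited papers: the `Cell` class, the set-based heap and all function names are assumptions made for the example.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.count = 0         # reference count
        self.children = []     # outgoing pointers


def add_ref(cell):
    cell.count += 1


def link(parent, child):
    parent.children.append(child)
    add_ref(child)


def drop_ref(cell, heap):
    """Reference counting: free a cell when its count reaches zero."""
    pending = [cell]
    while pending:                        # freeing may cascade to child cells
        c = pending.pop()
        c.count -= 1
        if c.count == 0:
            heap.discard(c)
            pending.extend(c.children)
            c.children = []


def mark_and_sweep(heap, roots):
    """Tracing: mark what the roots reach, then scan the whole heap."""
    marked = set()
    stack = list(roots)
    while stack:                          # mark phase
        c = stack.pop()
        if c not in marked:
            marked.add(c)
            stack.extend(c.children)
    dead = heap - marked                  # sweep phase examines every cell
    heap.difference_update(dead)
    return dead
```

If two cells point at each other and the last outside pointer is dropped, `drop_ref` leaves both in the heap because neither count reaches zero. A later `mark_and_sweep` with an empty root set reclaims them, at the price of examining every cell in the heap.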

The  Stop-and-Copy  garbage  collector  divides  the  available  memory  in  a  system 
into  two  regions  -  the  from-space  and  the  to-space.  Only  one  of  these  regions,  i.e., 
the  from-space,  is  available  to  the  mutator  at  any  time.  The  copying  algorithm  (Fenichel 
&  Yochelson  1969;  Cheney  1970)  traverses  all  the  reachable  records  in  the  from-space 
starting  at  a  root  pointer  set  and  copies  these  records  into  a  separate  area  of  memory, 
the to-space, that is currently unused by the mutator. Forwarding pointers are placed
in the from-space to redirect any references to a record that has already been copied.
At the end of
collection,  the  only  records  in  to-space  are  the  reachable  ones.  The  from-  and  to-space 
pointers  are  swapped  and  the  mutator  now  works  on  the  new  from-space.  Note  that 
the  amount  of  work  performed  is  proportional  to  the  number  of  reachable  cells 
whereas  the  traditional  mark-and-sweep  algorithms  take  time  proportional  to  the 
size of memory. For a given amount of reachable data, the cost of a copying
collection therefore does not grow as the size of memory increases.
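A compact sketch of the copying step, in the breadth-first style usually attributed to Cheney, is shown below. The representation is an assumption made for the example: a space is a Python list of records, a record is a list of indices into that space, and the forwarding pointers are kept in a dictionary rather than in the from-space itself.

```python
def copy_collect(from_space, roots):
    """Copy everything reachable from roots; return (to_space, new_roots)."""
    to_space = []
    forward = {}                    # from-space index -> to-space index

    def evacuate(i):
        if i not in forward:        # first visit: copy and record forwarding
            forward[i] = len(to_space)
            to_space.append(list(from_space[i]))
        return forward[i]

    new_roots = [evacuate(r) for r in roots]
    scan = 0
    while scan < len(to_space):     # scan index chases the end of to_space
        record = to_space[scan]
        for k, ptr in enumerate(record):
            record[k] = evacuate(ptr)   # rewrite pointer to the new location
        scan += 1
    return to_space, new_roots
```

Only reachable records are ever touched, so the loop runs once per live record no matter how much garbage the from-space holds.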

Generational  Garbage  Collection  (Liebermann  &  Hewitt  1983)  is  an  extension  of 
the  copying  garbage  collector  that  takes  advantage  of  two  significant  observations: 

(1)  high  rate  of  infant  mortality  -  A  young  object  is  more  likely  to  die  than  an  object 
that  has  survived  for  a  while; 

(2)  newer  records  are  more  likely  to  point  to  older  records  than  vice-versa.  An  older 
record  points  to  a  newer  record  only  if  it  is  altered  (reassigned)  after  it  is  initialized. 
This  flavour  of  reassignment  is  rather  unlikely  in  many  languages.  Generational 
garbage  collection  is  claimed  to  be  highly  suitable  for  such  languages. 

Records  with  a  similar  age  are  grouped  together  in  a  contiguous  area  of  memory. 
Once we have segregated objects on the basis of their age, reclamation efforts can be
concentrated  on  the  youngest  generation.  The  drawback  of  this  scheme  is  what  is 
known  in  garbage  collection  literature  as  the  Tenuring  Problem.  Old  objects  that  die 
may  take  a  long  time  to  be  reclaimed.  Even  worse,  they  may  never  be  reclaimed  until  the 
occurrence  of  an  offline  collection.  If  objects  are  promoted  too  quickly,  there  are 
more  objects  present  in  higher  generations.  Collecting  higher  generations  results  in 
significant  pauses  and  clever  promotion  policies  are  necessary  to  mask  this  unavoidable 
tradeoff. 
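The sketch below illustrates a collection confined to the youngest generation, with promotion by age. It is a simplified model under stated assumptions: objects live in a dictionary, there are two generations, and the `remembered` set (old objects that may point to young ones) as well as every name used are hypothetical.

```python
def minor_collect(objs, young, old, roots, remembered, promote_age=2):
    """Collect only the young generation.

    objs: id -> {"refs": [ids], "age": int}; young, old: sets of ids;
    remembered: ids of old objects that may point into the young generation.
    """
    # Starting points: program roots plus pointers from old into young.
    stack = [r for r in roots if r in young]
    for o in remembered:
        stack.extend(c for c in objs[o]["refs"] if c in young)
    live = set()
    while stack:
        i = stack.pop()
        if i not in live:
            live.add(i)
            stack.extend(c for c in objs[i]["refs"] if c in young)
    for dead in young - live:            # reclaim dead young objects only
        del objs[dead]
    young.intersection_update(live)
    promoted = set()
    for i in young:                      # survivors grow one collection older
        objs[i]["age"] += 1
        if objs[i]["age"] >= promote_age:
            promoted.add(i)
    young.difference_update(promoted)
    old.update(promoted)
    for i in promoted:                   # a promoted object may still point to young
        if any(c in young for c in objs[i]["refs"]):
            remembered.add(i)
```

An unreachable object that already sits in `old` is never examined here, which is the tenuring problem in miniature. A smaller `promote_age` fills the old generation faster, and a larger one makes survivors get traced repeatedly.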

4.2b  Multiprocessor  GC  algorithms:  The  desire  to  exploit  concurrency  among  tasks 
in  a  job  complicates  storage  management.  This  is  due  to  the  need  for  synchronization 
between  the  various  tasks  in  order  to  ensure  their  correctness  and  consistency. 
Programming  environments  for  highly  parallel  machines,  it  appears,  are  largely 
centred  around  dynamic  resource  management  schemes  which  require  sophisticated 
memory  allocation  and  reclamation  schemes.  Furthermore,  one  can  think  of  storage 
reclamation  itself  as  a  concurrent  set  of  processes. 

All  this  concurrency  requires  coordination  between  the  parallel  subtasks  of  a  single 
job which in turn requires efficient and reliable communication. Deficiencies of
traditional algorithms in a parallel setting have been detailed in Hudak & Keller
(1982, pp. 168-178). Many of the insufficiencies arise from the fact that the traditional
algorithms halt the computation process, preventing real-time response by the system.
In  other  words,  the  collector  (thread  of  control  that  is  responsible  for  garbage 
collection)  is  initiated  after  the  mutator(s)  (threads  of  control  corresponding  to  the 
application  program)  have  been  halted. 

The  main  overhead  in  parallel  GC  occurs  in  maintaining  intraprocessor  and 
interprocessor  references  to  an  object.  If  this  is  not  done,  it  is  possible  for  an  object 
to  be  collected  even  while  a  reference  to  it  exists,  violating  the  correctness  criteria  of 
the  algorithm. 

•  GC  in  shared  memory  machines 

One of the earliest parallel mark-and-sweep algorithms for shared memory machines is
discussed in Steele (1975). A dedicated storage reclamation processor handles garbage
collection  of  a  memory  space  that  it  shares  with  a  mutator.  In  the  general  case,  this 
algorithm  could  be  extended  to  handle  any  number  of  mutators  that  share  a  single 
address  space.  Hence,  this  algorithm  is  applicable  to  a  tightly  coupled  system.  The 
correctness  proof  for  a  similar  mark-and-sweep  based  algorithm  was  developed  in 
Dijkstra  et  al  (1978). 

Locking mechanisms must be implemented to prevent two processors from
simultaneously manipulating the same memory location. Extensions of the
stop-and-copy algorithm and generational GC implemented on shared memory machines
are presented in Appel et al (1988, pp. 11-20) and Pallas & Ungar (1988, pp. 268-277).

•  GC  in  distributed  memory  machines 

In  distributed  architectures,  each  processing  unit  is  associated  with  its  own  local 
memory.  A  global  communication  network  connects  the  different  processing  units 
together.  As  intra-processor  communication  is  orders  of  magnitude  faster  than 
interprocessor  communication,  we  are  forced  to  exploit  locality  and  manipulate  data 
in  the  local  memory  in  a  distributed  memory  machine. 

There  are  two  levels  of  garbage  collection  on  distributed  memory  machines,  i.e., 
local  and  global  garbage  collection.  Global  garbage  collection  is  a  systemwide  GC 
involving  all  processors  in  the  system.  Local  garbage  collection  is  carried  out  locally 
on  each  processor  without  inhibiting  the  processing  activities  on  another  processor. 
There  are  costs  associated  with  both  local  and  global  GC.  The  interval  of  a  global 
garbage  collection  is  dictated  by  the  first  processing  unit  that  needs  memory.  Other 
processors  have  to  cooperate  even  if  they  have  sufficient  local  memory  to  continue 
work. The overhead of synchronization, that is, of orchestrating a global start and
stop, could be substantial, especially when the collection process has to be real time.
To limit this cost, the times taken by different processing elements to run out
of space should be well balanced. Hardware techniques like logic and signal lines
or  software  mechanisms  like  barrier  synchronization  can  be  used  to  reduce  the 
synchronization  overhead.  Local  garbage  collection  requires  a  reference  management 
table  to  determine  whether  an  object  is  wholly  local  or  not.  The  entries  in  this  table 
should  be  periodically  updated  and  eventually  reclaimed. 
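The role of the reference management table in local garbage collection can be sketched as follows. The class and method names are hypothetical, and the model assumes that each node is told when a remote node acquires or releases a reference. Delivering that information reliably is the hard part in practice and is not modelled here.

```python
class NodeHeap:
    def __init__(self):
        self.objects = {}     # id -> list of local ids it references
        self.roots = set()    # ids referenced by the local computation
        self.exported = {}    # reference table: id -> set of remote node ids

    def export(self, obj_id, remote_node):
        """Record that remote_node now holds a reference to obj_id."""
        self.exported.setdefault(obj_id, set()).add(remote_node)

    def release(self, obj_id, remote_node):
        """Record that remote_node dropped its reference to obj_id."""
        holders = self.exported.get(obj_id, set())
        holders.discard(remote_node)
        if not holders:
            self.exported.pop(obj_id, None)   # the table entry is reclaimed

    def local_collect(self):
        """Local GC: exported objects are treated as extra roots."""
        stack = list(self.roots) + list(self.exported)
        live = set()
        while stack:
            i = stack.pop()
            if i not in live and i in self.objects:
                live.add(i)
                stack.extend(self.objects[i])
        freed = set(self.objects) - live
        for i in freed:
            del self.objects[i]
        return freed
```

A local collection never consults another processor. It conservatively keeps anything listed in the table, so an object that is only remotely referenced survives until its entry is released.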

The  authors’  group  is  currently  studying  algorithms  for  distributed  memory 
management  and  a  hierarchical  approach  to  memory  management  has  been  presented 
in Venkatasubramanian (1991).

4.3 I/O management

While advances in processor architecture and memory/register management strategies
have  enhanced  performance  in  parallel  machines,  commensurate  improvements  in  I/O 
performance  have  not  been  made.  The  disparity  between  the  performance  of  the  I/O 
subsystem  and  the  other  subsystems  in  current  parallel  architectures  must  be  reduced. 
This  problem  is  acute  in  the  data  parallel  model  explained  earlier,  where  the  physical 
dimensions  of  the  machine  pose  a  limitation  on  the  amount  of  information  which 
can  enter  or  leave  the  surfaces  of  the  machine. 

In a three-dimensional physical world with $n$ processors along each dimension, a
cube has a total of $n^3$ processors and 6 faces. As all the input/output to a system
configured as a cube occurs through the faces of the cube, the $n^3$ processors
communicate with the outside world through the $6n^2$ processors on the faces of the
cube. It must be noted that this is the hypothetical best case for a cube configured
system. In terms of the total number of processors $N = n^3$, this yields an I/O
limitation of $6N^{2/3}$ entry points for loading information from external
sources into the processors of the cube simultaneously. To illustrate the above
discussion, consider a cube with a million processors ($n = 100$). It has
$6n^2 = 60{,}000$ entry points, so on the order of $10^6 / 60{,}000 \approx 17$
basic time units would be needed to load one unit of information into each of the
processors in a million processor multicomputer with optimal I/O.
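The figure of 17 time units can be checked with a few lines of Python. The function name is hypothetical, and the model follows the text in counting $6n^2$ surface processors, each accepting one unit of information per basic time unit.

```python
import math


def load_time_units(n):
    """Time units needed to load one item into every processor of an n*n*n cube."""
    total = n ** 3            # processors in the cube
    surface = 6 * n ** 2      # entry points, as counted in the text
    return math.ceil(total / surface)
```

Since the ratio simplifies to $n/6$, the loading time grows linearly with the side of the cube even under these optimistic assumptions.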

It  is  possible  to  avoid  the  I/O  bottleneck  for  specific  applications  (Kung  1986, 
pp. 49-54), but the bottom line remains that we must focus efforts on improving the
I/O  subsystem  performance  to  the  level  that  it  does  not  affect  throughput.  Real-time 
visualization  and  animation  are  examples  of  I/O  limited  applications. 

Interoperability  in  programming  environments,  i.e.,  the  ability  to  switch  from  one 
programming  framework  to  another,  will  play  an  important  role  in  future  distributed 
systems.  Multiparadigm  programming  environments  are  software  technologies  that 
will  permit  a  smooth  transition  among  programming  paradigms. 


5.  Multiparadigm  programming  environments 

In  order  to  address  the  problems  of  concurrent  computation,  a  number  of  important 
programming  paradigms  have  been  developed.  Our  research  focuses  on  providing  a 
flexible  basis  for  providing  efficient  execution  of  important  programming  constructs 
inspired  by  different  linguistic  paradigms  in  a  single  framework. 

5.1  Declarative  programming 

In declarative programming, there is no notion of instruction sequencing - we “declare”
what  is  to  be  computed  rather  than  how  this  computation  must  be  done.  Separating 
logic  from  control  is  an  important  step  toward  achieving  abstraction  and  modularity. 
All  control  or  computational  issues  governing  machine  behavior  are  relegated  to 
the  language  implementation.  Two  approaches  to  declarative  programming  are 
functional  and  logic  programming. 

Declarative  programming  is  a  more  radical  approach  to  concurrent  computing. 
Unlike  the  CSP  model  where  the  concurrency  and  state  transformations  are  explicit, 
declarative  models  are  implicitly  parallel.  An  important  feature  of  declarative 
languages  is  that  variables  in  these  languages  correspond  to  values;  they  have  no 
notion of a computational history or state. The absence of state resolves a number
of determinacy issues; for example, there are no cache consistency problems. However,
the ability to create shared, modifiable data structures cannot be cleanly integrated
into declarative models. State must be perpetually passed through procedural
abstractions such as functions or relations. This is inadequate for programming in
the large, which requires support for data abstraction.

Functional  and  concurrent  logic  programming  suggest  important  linguistic  constructs 
and  programming  techniques.  Important  language  features  in  these  paradigms,  such 
as  higher-order  functions  and  pattern-matching,  can  be  defined  in  terms  of  actor 
primitives  (Agha  1989,  pp.  1-19)  and  used  as  needed. 

5.2  Concurrent  object  oriented  programming 

Object  Oriented  Programming  (OOP)  is  another  programming  paradigm  whose  roots 
go  as  far  back  as  Simula  (a  simulation  language),  with  significant  contributions  made 
by  developments  in  languages  such  as  Smalltalk  and  Flavors  (Wegner  1990).  Modern 
object  oriented  programming  has  been  influenced  by  a  number  of  individuals  and 
concepts  which  makes  it  difficult  to  give  a  universally  acceptable  definition  of  object 
oriented  programming.  A  simplified  definition  is  given  in  Madsen  (1987): 

A program execution is regarded as a physical model, simulating a behavior of either
a  real  or  imaginary  part  of  the  world. 

Here,  the  universe  is  viewed  as  being  composed  of  many  passive  entities  with  functions 
which  transform  them.  Structuring  mechanisms  in  OOP  include  classification  and 
specialization.  Classification  enables  us  to  group  diverse  objects  based  on  some 
common  characteristics.  Specialization  supports  the  specification  of  objects  as 
modifications  of  existing  specifications.  Abstraction  and  inheritance  are  language 
mechanisms  which  allow  this  structuring.  Object-oriented  programming  provides 
encapsulation:  the  user  sees  a  simplified  interface  consisting  of  a  set  of  operations 
permissible  on  a  structure. 

5.2a  Abstraction:  Abstraction  is  a  powerful  tool  which  allows  programmers  to 
manage  complexity  by  dealing  with  high  level  concepts  before  dealing  with  their 
representation.  Object  oriented  languages  are  designed  to  support  both  data  and 
procedural abstraction. Procedural abstraction, common to all modern programming
languages,  allows  different  actions  to  be  grouped  into  a  single  name.  Data  abstraction 
is  a  data  structuring  and  packaging  mechanism  in  which  small  subsystems  which 
have  certain  common  characteristics  are  grouped  together  to  form  a  larger  subsystem. 

In other words, an object is a unit of encapsulation which hides from the user the
implementation details (i.e., the representation) of the data structure as well as of
the operations performed on it. The user need only be concerned about what operations
may  be  performed  on  the  object.  An  advantage  of  object-oriented  programming  is 
the  easy  maintenance,  modifiability  and  portability  of  the  code.  Actors  are  similar  to 
objects  in  that  they  encapsulate  a  local  state  and  a  set  of  operations.  However,  actors 
differ  from  objects  in  several  ways.  Perhaps  the  most  important  distinction  is  that 
actors  are  inherently  concurrent  -  many  actors  may  be  active  at  the  same  time.  By 
contrast,  in  sequential  object  oriented  languages,  only  one  object  may  be  active  at  a 
time. 
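The distinction can be made concrete with a minimal actor-style object in Python. This is an illustrative sketch only: the `Counter` class, its message names and the use of one thread and one queue per actor are assumptions of the example, not the actor model's definition or any particular actor language.

```python
import queue
import threading


class Counter:
    """An actor: private state, changed only by messages from its mailbox."""

    def __init__(self):
        self._count = 0                       # encapsulated local state
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message, reply_to=None):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put((message, reply_to))

    def _run(self):
        while True:                           # process one message at a time
            message, reply_to = self._mailbox.get()
            if message == "stop":
                return
            if message == "inc":
                self._count += 1
            elif message == "get":
                reply_to.put(self._count)

    def join(self):
        self._thread.join()
```

Several `Counter` instances can be active at once, each draining its own mailbox. No lock is needed on `_count`, because only the actor's own thread ever touches it.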

5.2b Inheritance: Inheritance is a mechanism which facilitates code sharing and
reusability. Code is structured in terms of superclasses and subclasses. Inheritance
mechanisms allow a class to borrow the functionality (methods) of its superclass and
specialize it. Code sharing can lead to name conflicts - the same identifier may be
bound to procedures in an object and in its class. Object oriented languages differ in
how  such  name  conflicts  are  handled.  One  proposal,  advanced  by  Jagannathan  and 
Agha  (see,  for  example,  Agha  &  Jagannathan  1991),  is  to  allow  programmers  to  define 
different  possible  inheritance  mechanisms  in  a  single  linguistic  framework  by  providing 
the  ability  to  define  and  manipulate  the  underlying  name  bindings. 

The  power  of  Concurrent  Object  Oriented  Programming  (COOP)  comes  from  the 
ability  to  develop  abstractions  which  hide  the  details  and  complexities  of  concurrency. 
COOP  systems  are  larger  systems  of  communicating  actors  and  they  can  be  built  in 
terms  of  actor  primitives.  The  inherent  locality  in  distributed  systems  is  modelled 
naturally  through  a  modular  design.  A  judicious  choice  of  abstractions  and  primitives 
which  express  concurrency  in  a  manner  transparent  to  the  user  is  of  crucial  importance.  It 
has  been  observed  that  the  main  contribution  of  OOP  to  concurrency  is  to  provide 
reusable abstractions which can hide the low level details of partitioning,
synchronization and communication from the user (Lim & Johnson 1989). Thus, COOP
systems illustrate the power of actors as building blocks. Another approach to COOP
can be found in ABCL (An Actor Based Concurrent Language) (Yonezawa 1990).

5.3  Reflection 

In  the  normal  course  of  execution  of  a  program,  a  number  of  objects  are  implicit.  In 
particular,  the  interpreter  or  compiler  being  used  to  evaluate  the  code  for  an  object, 
the  text  of  the  code,  the  environment  in  which  the  bindings  of  identifiers  in  an  object 
are  evaluated,  and  the  communication  network  are  all  implicit.  As  one  moves  from 
a  higher  level  language  to  its  implementation  language,  a  number  of  objects  are  given 
concrete representations and can be explicitly manipulated at the lower implementation
level  (for  a  more  detailed  discussion  see  Agha  1990,  1991,  pp.  1-59). 

The  dilemma  is  that  if  a  very  low  level  language  is  used,  the  advantages  of  abstraction 
provided in a high-level notation are lost. Alternatively, the flexibility of a low-level
language  may  be  lost  in  a  high-level  language.  A  reflective  architecture  addresses  this 
problem  by  allowing  us  to  program  in  a  high-level  language  without  losing  the 
possibility  of  representing  and  manipulating  the  objects  that  are  normally  implicit 
(Maes  1987).  Reification  operators  can  be  used  to  represent  at  the  level  of  the 
application,  objects  which  are  in  the  underlying  architecture.  These  objects  can  then 
be manipulated like any other objects at the “higher” application level. Reflective
operators  may  then  be  used  to  install  the  modified  objects  into  the  underlying 
architecture.  Reflection  thus  provides  a  causal  connection  between  the  operations 
performed  on  this  representation  and  the  corresponding  objects  in  the  underlying 
architecture. 

In  a  COOP  system,  the  evaluator  of  an  object  is  called  its  meta-object.  Reflective 
architectures  in  COOP  have  been  used  to  implement  a  number  of  interesting 
applications.  For  example,  Watanabe  and  Yonezawa  (see  Yonezawa  1991)  have  used 
it  to  separate  the  logic  of  an  algorithm  from  its  scheduling  for  the  purposes  of  a 
simulation:  in  order  to  build  a  virtual  time  simulation,  messages  are  time-stamped 
by  the  meta-object  and  sent  to  the  meta-object  of  the  target  which  uses  the  time-stamp 
to  schedule  the  processing  of  a  message  or  to  decide  if  a  rollback  to  a  previous  state 
is  required.  Thus  the  code  for  an  individual  object  need  only  contain  the  logic  of  the 
simulation,  not  the  mechanisms  used  to  carry  out  the  simulation;  the  specification  of 
the mechanisms is separated into the meta-objects.

In summary, reflection provides a way of integrating various language paradigms.
By using the underlying representations, one can define programming constructs and
their interrelation as first-class objects. Thus a multiparadigm programming
environment may be created for heterogeneous computing. Components of a system may
use language constructs which are more suited to the computations they are carrying
out.
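The separation between base-level logic and meta-level scheduling described for the virtual time simulation can be sketched as below. This is a loose illustration and not the design of Watanabe and Yonezawa: the `MetaObject` and `Logger` classes, the `delay` parameter and the rollback counter are assumptions, and an actual rollback (restoring saved state) is deliberately left out.

```python
import heapq


class MetaObject:
    """Evaluator for one base-level object; owns all scheduling concerns."""

    def __init__(self, base):
        self.base = base        # base object: application logic only
        self.clock = 0          # local virtual time
        self.inbox = []         # heap of (timestamp, sequence number, payload)
        self.seq = 0
        self.rollbacks = 0

    def send(self, target_meta, payload, delay=1):
        """Time-stamp an outgoing message and hand it to the target's meta-object."""
        target_meta.receive(self.clock + delay, payload)

    def receive(self, timestamp, payload):
        if timestamp < self.clock:
            self.rollbacks += 1  # straggler: a full system would restore state here
        heapq.heappush(self.inbox, (timestamp, self.seq, payload))
        self.seq += 1

    def step(self):
        """Deliver the pending message with the smallest time-stamp."""
        timestamp, _, payload = heapq.heappop(self.inbox)
        self.clock = max(self.clock, timestamp)
        self.base.handle(payload)


class Logger:
    """Base-level object: contains no timing or scheduling code."""

    def __init__(self):
        self.seen = []

    def handle(self, payload):
        self.seen.append(payload)
```

`Logger` never sees a time-stamp. Changing the scheduling policy means replacing `MetaObject` while the base-level code stays untouched.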


6.  Discussion 

There  are  a  number  of  ongoing  efforts  to  achieve  the  fifth  generation  computing 
objective.  The  US  High  Performance  Computing  and  Communication  (HPCC) 
Initiative  (1991-1996),  funded  at  a  level  of  over  4  billion  dollars,  attempts  to  address 
fundamental  problems  in  high  performance  computing.  Major  emphasis  is  placed  on 
the  development  of  hardware  and  software  technologies  required  for  scalable,  parallel 
computing  systems  with  a  performance  of  trillions  of  operations  per  second  on  a  wide 
range  of  important  applications.  The  European  ESPRIT  project  is  placing  emphasis 
on  models  of  computing  based  on  the  human  brain  and  parallel  architectures. 

Similar  efforts  are  being  conducted  in  Japan  in  the  New  Information  Processing 
Technology  funded  by  the  Japanese  Ministry  of  International  Trade  and  Industry. 
Areas  of  research  include  establishing  sound  theoretical  foundations  for  flexible 
information  processing  technologies  using  distributed  and  cooperating  agents  or 
actors,  development  of  massively  parallel  computing  technologies  (a  million  processor 
architecture)  and  new  application  domains  for  such  technologies. 

The  US  HPCC  program  has  identified  a  large  number  of  problems  critical  to  science 
and  engineering,  the  so-called  grand  challenge  problems.  Addressing  these  grand 
challenges  will  require  orders  of  magnitude  increase  in  computational  power.  For 
example,  in  biomedical  investigations,  research  on  genes  causing  cancer  requires 
management  of  data  of  the  order  of  billions  of  molecular  units.  The  fastest 
supercomputers  available  in  the  market  today  will  require  hundreds  of  years  of 
processing  time  to  yield  the  necessary  information.  Similarly,  fundamental  research 
in  physics  and  chemistry,  and  superconductor  research  require  accurate  simulations 
to  predict  the  transformations  of  materials  under  varying  conditions.  Weather 
prediction  and  information  about  severe  changes  in  atmospheric  conditions  are  other 
computationally  intensive  problems. 

Scalable  concurrent  computing  is  a  fundamental  enabling  technology  required  to 
deliver  the  promise  of  high  performance  computing.  We  are  just  beginning  to  address 
some of the problems involved in using massively parallel processing.